Foreword


Following the aims of the Methodos Series perfectly, this 13th volume on
agent-based models provides a general view of the problems raised by this approach
and shows how these problems may be solved.

These methods are derived from computer simulation studies used by
mathematicians and physicists. They are now applied in many social disciplines such as
artificial life (Alife), political sciences, evolutionary psychology, demography, and 
many others. Those who introduced them often took care not to consider each social 
science separately but to view them as a whole, incorporating a wide spectrum of 
social processes — demographic, economic, sociological, political, and so on. 

Rather than modelling specific data, this approach models theoretical ideas and 
is based on computer simulation. Its aim is to understand how the behaviour of 
biological, social, or more complex systems arises from the characteristics of the 
individuals or agents composing the said system. As Billari and Prskawetz (2003, 
p. 42) said, 


Different to the approach of experimental economics and other fields of behavioural science 
that aim to understand why specific rules are applied by humans, agent-based computational 
models pre-suppose rules of behaviour and verify whether these micro based rules can 
explain macroscopic regularities. 


This is, therefore, a bottom-up approach, with population-level behaviour
emerging from the rules of behaviour of autonomous individuals. These rules need to
be clearly discussed; unfortunately, this approach is now used without sufficient
discussion in many social sciences. It eliminates the need for empirical data on
personal or social characteristics to explain a phenomenon, as it is based on simple 
decision-making rules followed by individuals, which can explain some real-world 
phenomena. But how can we find these rules? As Burch (2003, p. 251) puts it, 

A model explains some real-world phenomenon if a) the model is appropriate to the
real-world system [...] and b) if the model logically implies the phenomenon, in other
words, if the phenomenon follows logically from the model as specified to fit a
particular part of the real world.
Also, a theoretical model of this kind cannot be validated in the same way as an
empirical model with the “covering law” approach, which hinders social research 
and leads to a pessimistic view of the explanatory power of the social sciences. In 
Franck’s words (Franck 2002, p. 289), 


But, one has ceased to credit deduction with the power of explaining phenomena. 
Explaining phenomena means discovering principles which are implied by the phenomena. 
It does not mean discovering phenomena which are implied by the principles. 


As the agent-based approach focuses on the mechanisms driving the actions 
of individuals or agents, it will simulate the evolution of such a population from 
simple rules of behaviour. It may thus use game theory, complex systems theory, 
emergence, evolutionary programming and — to introduce randomness — Monte 
Carlo methods. It may also use survey data, not to explain the phenomenon studied, 
but only to verify if the parameters used in the simulation lead to a behaviour similar 
to the one observed in the survey. 

As we have already said, such an approach raises many problems, which this
volume will try to address. We present the main problems here, leaving the
reader to see how Silverman has treated them.

The first problem is that these models “are intended to represent the import and 
impact of individual actions on the macro-level patterns observed in a complex 
system” (Courgeau et al. 2017, p. 38). This implies that a phenomenon emerging 
at the aggregate level can be entirely explained by individual behaviour. Holland 
(2012, p. 48), however, states that agent-based models include “little provision for 
agent conglomerates that provide building blocks and behaviour at a higher level 
of organisation.” For instance, a multilevel study on the effects of an individual 
characteristic (being a farmer) and the corresponding aggregate characteristic (the 
proportion of farmers living in an area) on the probability of internal migration in 
Norway shows that the effects are contradictory (Courgeau 2007): it seems hard 
to explain a macro-characteristic acting positively by a micro-characteristic acting 
negatively. In fact, micro-level rules are often hard to link to aggregate-level rules, 
and I believe that aggregate-level rules cannot be modelled with a purely micro 
approach, for they transcend the behaviours of the component agents. 

The second problem is that this approach is basically bottom-up. However, it 
seems important to take into consideration simultaneously a top-down process from 
higher-level properties to lower-level entities. More specifically, we should speak 
of a micro-macro link (Conte et al. 2012, p. 336) that “is the loop process by 
which behaviour at the individual level generates higher-level structures
(bottom-up process), which feedback to the lower level (top-down), sometimes reinforcing
the producing behaviour either directly or indirectly”. The bottom-up approach of 
a standard agent-based model cannot take such a reciprocal micro-macro link into 
account, given that it only simulates one level of analysis. 

The third problem concerns the validation of an agent-based model. Such an 
approach imitates human behaviour using some well-chosen mechanisms. It may 
be judged successful when it accurately reproduces the structure of this behaviour. 
Determining success, however, requires a method very different from the standard
tests used to verify the validity of the effects of different characteristics in the 
other approaches. Such tests can be performed in the natural sciences but are more 
difficult in the social sciences. As Küppers and Lenhard (2005, paragraph 1.3)
observe,

The reliability of the knowledge produced by computer simulation is taken for granted if
the physical model is correct. In the second case of social simulations in general there is
no theoretical model on which one could rely. The knowledge produced in this case seems
to be valid if some characteristic of the social dynamics known from experience with the
social world are reproduced by the simulation.


To determine if such an exploration has been successful, we need to consider 
different aspects. First, how do we test that there are no other models offering a 
better explanation of the observed phenomenon? Researchers often try out different 
kinds of models so they can choose the one most consistent with empirical data. But 
this hardly solves the problem, as there is an infinity of models that can predict the 
same empirical result as well or even better. Second, how do we test that the chosen 
model has a good fit with the observed data? Unfortunately, there is no clearly 
defined procedure for testing the fit of a simulation model, such as significance 
tests for the approaches described earlier. We can conclude that there are no clear 
verification and validation procedures for agent-based models in the social sciences. 

While the agent-based approach appears to resemble event-history analysis, for it 
focuses on individual behaviour, it nevertheless aims to explain collective behaviour. 
At that point, the key question is: how do we generate macroscopic regularity using 
simple individual rules? Conte et al. (2012, p. 340) perfectly describe the difficulties 
encountered: 


First, how to find out the simple local rules? How to avoid ad hoc and arbitrary 
explanations? As already observed, one criterion has often been used, i.e., choose the 
conditions that are sufficient to generate a given effect. However, this leads to a great deal 
of alternative options, all of which are to some extent arbitrary. 


Without factoring in the influence of networks on individual behaviour, we can 
hardly obtain a macro behaviour merely by aggregating individual behaviours. 
To obtain more satisfactory models, we must introduce decision-making theories. 
Unfortunately, the choice of theory is influenced by the researcher’s discipline and 
can produce highly divergent results for the same phenomenon studied. 

In order to go further, Chap. 9, co-authored by Jakub Bijak, Daniel Courgeau,
Robert Franck and Eric Silverman, proposes, for demography, the enlargement of
agent-based models into model-based research. This will not be a new paradigm in
the traditional sense, as with the cross-sectional, the cohort, the event-history and the 
multilevel approaches, but a new way to overcome the limitations of demographic 
knowledge. It is a research programme which adds a new avenue of empirical 
relevance to demographic research. The examples given in the following chapters, 
despite the simplicity of the models used, give us a glimpse of the importance of 
model-based demography. 
I hope I have given the reader of this volume a clear idea of its importance for
the social sciences.


Mougins, France Daniel Courgeau 
August 2017 
Acronyms


In general I have attempted to keep this volume free of excessive abbreviations, but 
the acronyms below will appear at times given their widespread usage in related 
fields of research. 


ABCD Agent-Based Computational Demography. This term describes an 
approach to the discipline of demography which incorporates agent-based 
modelling. 

ABM Agent-Based Model. These are computer simulations designed to examine 
the behaviour and interactions of autonomous agents. 

ABSS Agent-Based Social Simulation. An approach to social simulation which 
explicitly focuses on the use of agent-based models. 
Part I
Agent-Based Models 


This first part of the text examines the theory and practice of computational 
modelling, much of it viewed through the lens of artificial life. Artificial life, or 
Alife for short, is a discipline that focuses on ‘the simulation and synthesis of living 
systems’, most frequently through the use of simulation. Using agent-based models, 
evolutionary algorithms, and other recent innovations in simulation and robotics, 
Alife researchers hope to unravel the mysteries of how life develops and evolves in 
the real world by studying digital manifestations of ‘life as it could be’. 

As Alife has itself evolved over the years, researchers have sought closer 
connections to the real world and to real biology. This has brought about difficult 
questions concerning the integration of real data into simulated worlds, and the 
relationship between digital biology and physical biology. As early Alife has slowly 
given way to a greater desire for empirical relevance, it has become increasingly 
important to understand the potential role Alife and Alife-inspired approaches can 
play in understanding real biological systems. 

Of course, the difficulties inherent in modelling complex biological processes
and populations are as familiar to population biologists as to digital ones,
perhaps even more so given the short history of Alife. We will examine in
detail some of the theoretical frameworks developed by population biologists in 
order to develop their models and position them as a valid form of enquiry, 
and investigate how we might use these frameworks as a way to understand and 
categorise computational modelling efforts in disciplines such as Alife. In so doing, 
we will lay the foundations for Part II, in which we will investigate the additional
modelling complexities we encounter when we begin to model social systems. 


Chapter 1 
Introduction 


1.1 Overview 


As computer simulation has developed as a methodology, so its range of applications 
has grown across different fields. Beyond the use of mathematical models for 
physics and engineering, simulation is now used to investigate fields as varied and 
disparate as political science, psychology, evolutionary biology, and many other 
disciplines. 

With simulation becoming such a common adjunct to conventional empirical 
research, debate regarding the methodological merits of computer simulation 
continues to develop. Some fields, artificial life being the primary example used 
in this text, have developed using computer simulation as a central driving force. 
In such a case, researchers have developed theoretical frameworks to delineate the 
function and purpose of computer simulation within their field of study. 

However, the expansion of computer simulation into fields which use empirical 
study as a central methodology means that new frameworks for the appropriate 
use of simulation must develop. How might simulation enhance one’s use of 
conventional empirical data? Can simulations provide additions to
empirically-collected data-sets, or must simulation data be treated entirely differently? How
does theoretical bias influence the results of a simulation, and how can such biases 
be investigated and accounted for? 

The central goal of this text is to investigate these increasingly important 
concerns within the context of simulation for the social sciences. Agent-based 
models in particular have become a popular method for testing sociological 
hypotheses that are otherwise difficult or impossible to analyse empirically, and 
as such a methodological examination of social simulations becomes critical as 
social scientists begin to use such models to influence social policy. Without a clear 
understanding of the relationship between social simulation and social sciences as a 
whole, the use of models to explain social phenomena becomes difficult to justify. 
Bearing in mind this central theme, this text will utilise a modelling example
which will be revisited regularly in Parts I and II. This example will serve as a 
means for illustrating the important concepts described in the various modelling 
frameworks under discussion, and for tying together these frameworks by showing 
the effect of each upon the construction and implementation of a simulation. This 
central example takes the form of a model of bird migration; this example seemed 
most appropriate as this sort of problem can be examined through various modelling 
means, from mathematical to agent-based computational models. The context and 
purpose of this hypothetical model will vary from example to example, but the 
central concern of developing an understanding of the behaviour of migratory birds 
will remain throughout. 

Toward the latter half of Part II, we will use the classic example of Schelling’s 
residential segregation model (Schelling 1971) to discuss some particular
methodological points in detail. Part III will delve deeply into specific examples of
agent-based modelling work in the field of demography in order to illustrate how 
the modelling concepts discussed in Parts I and II can influence the practice of 
modelling in social science. 


1.2 Artificial Life as Digital Biology 


The field of artificial life provides a useful example of the development of theoretical 
frameworks to underwrite the use of simulation models in research. The Artificial 
Life conference bills itself as a gathering to discuss ‘the simulation and synthesis 
of living systems’; with such potentially grandiose claims about the importance of 
artificial life simulations, theoretical debate within the field has been both frequent 
and fierce. 

In the early days of Alife, Langton and other progenitors of this novel research 
movement viewed simulation as a means to develop actual digital instantiations 
of living systems. Beyond being an adjunct to biology, Alife was viewed as 
digital biology, most famously described as the investigation of ‘life-as-it-could-be’ 
(Langton 1992). Ray boasted of his Tierra simulation’s explosion of varied digital 
organisms (Ray 1994), and theorists proposed this sort of digital biology as a means 
for divining the nature of living systems. 


1.2.1 Artificial Life as Empirical Data-Point 


Since these heady days, artificial life has sought more conventional forms of
methodological justification, seeking to link simulation with more conventional
means of data-gathering in biology. This has led to varying forms of theoretical
justification within Alife, ranging from further explorations of Langton’s early ideas 
(Bedau 1998; Silverman and Bullock 2004) to the use of Alife simulation as a form
of ‘opaque thought experiment’ (Di Paolo et al. 2000). 

Within this text, these varying theoretical frameworks for Alife will be examined 
in turn, both within the context of biology and within Alife itself. Once Alife 
seeks direct links with conventional biology, theoretical justification becomes 
correspondingly more difficult, and thus the debate must branch out into more 
in-depth discussions of biological modelling methodology. An investigation of the 
use of modelling in population biology, beginning with the somewhat controversial
ideas of Levins (1966, 1968), provides a means for describing and categorising the
most important methodological elements of biological models. Having developed 
an understanding of the complex relationship between biology and Alife, we can 
then proceed to a discussion of the future of modelling within the social sciences. 


1.3 Social Simulation and Sociological Relevance 


Social simulation has appeared in the limelight within social science quite recently, 
starting with Schelling’s well-known residential segregation model (Schelling 1978) 
and continuing into Axelrod’s explorations of cooperative behaviour (Axelrod 
1984). The development of simple algorithms and rules that can describe elements 
of social behaviour has led to an increasing drive to produce simulations of social 
systems, in the hopes that such systems can provide insight into the complexity of 
human society. 

The current state-of-the-art within social simulation relies upon the use of
agent-based models similar to those popularised in Alife. Cederman’s influential book
describing the use of such models in political science has helped to bolster an
increasing community of modellers who hope that such individual-based
simulations can reveal the emergence of higher-order complexity that we see around us in
human society (Cederman 1997). Social science being a field where the empirical 
collection of data is already a significant difficulty, the prospect of using simulation 
to produce insights regarding the formation and evolution of human society is an 
enticing one for many. 


1.3.1 Methodological Concerns in Social Simulation 


Of course, with such possibilities comes great debate from within the social 
science community. Proponents offer varying justifications of the potential power 
of simulation in social science; Epstein echoes the Alife viewpoint by proposing 
that social simulation can provide ‘generative social science,’ a means to generate
new empirical data-points (Epstein 1999). Similarly, Axelrod stresses the ability of 
social simulation to enhance conventional empirical studies (Axelrod and Tesfatsion 
2006). 
Others, however, are more cautious in their endorsement of social simulation.
Klüver and Stoica stress the difficulty in creating models consistent with social
theory (Klüver et al. 2003), noting that social systems do not lend themselves to the
same hierarchical deconstruction as some other complex systems. Others theorise 
that social simulation faces the danger of incorporating vast theoretical biases into 
its models, eliminating one of the potential strengths of social models: a means for 
developing more general social theory (Silverman and Bryden 2007). 

Further examinations of these questions within this text will seek to link such 
ideas with the methodological frameworks developed within Alife modelling and 
biology. While both fields display obvious differences in both methodological and 
theoretical objectives, the philosophical difficulties facing agent-based modelling 
in these contexts are much the same. In both cases the link between empirical
data-gathering and simulated data-generation is difficult to develop, and as a consequence
the use of simulation can be difficult to justify without a suitable theoretical
framework.


1.4 Case Study: Schelling’s Residential Segregation Model 


Having developed a detailed comparison between the use of models in biology and 
social science, this text will use Schelling’s residential segregation model as a case 
study for examining the implications of the theoretical frameworks discussed and 
outlined in that comparison. Schelling’s model is famously simple, its initial version 
running on nothing more than a chequerboard, but its conclusions had a far-reaching 
impact on social theory at the time (Schelling 1978). Schelling’s ideas regarding the 
‘micromotives’ of individuals within a society, and the resulting effects upon that 
larger society, sparked extensive discussion of the role of individuals in collective 
social behaviour. 
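
To make the mechanics of such a chequerboard model concrete, the following is a minimal sketch in Python of a Schelling-style simulation. It is an illustration under stated assumptions and not a reconstruction of Schelling's own procedure: the board size, the share of empty squares, the eight-square neighbourhood, the tolerance threshold, and the rule that an unhappy agent jumps to a randomly chosen empty square are all choices made here for brevity, and every function name is hypothetical.

```python
import random

EMPTY = 0  # agents are coded 1 or 2; 0 marks an empty square


def make_grid(size, frac_empty, rng):
    """Build a size x size board with two equal groups and some empty squares."""
    n_cells = size * size
    n_empty = int(n_cells * frac_empty)
    n_agents = n_cells - n_empty
    cells = [1] * (n_agents // 2) + [2] * (n_agents - n_agents // 2)
    cells += [EMPTY] * n_empty
    rng.shuffle(cells)
    return [cells[r * size:(r + 1) * size] for r in range(size)]


def like_share(grid, r, c):
    """Fraction of occupied neighbours of the same type (1.0 if no neighbours)."""
    size = len(grid)
    same = total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr, dc) == (0, 0) or not (0 <= rr < size and 0 <= cc < size):
                continue  # skip the agent itself; board edges do not wrap
            if grid[rr][cc] != EMPTY:
                total += 1
                same += grid[rr][cc] == grid[r][c]
    return same / total if total else 1.0


def mean_like_share(grid):
    """Average like-neighbour share over all agents: a crude segregation index."""
    shares = [like_share(grid, r, c)
              for r in range(len(grid)) for c in range(len(grid))
              if grid[r][c] != EMPTY]
    return sum(shares) / len(shares)


def step(grid, threshold, rng):
    """Visit agents in random order; unhappy ones move to a random empty square."""
    size = len(grid)
    agents = [(r, c) for r in range(size) for c in range(size)
              if grid[r][c] != EMPTY]
    rng.shuffle(agents)
    moves = 0
    for r, c in agents:
        if like_share(grid, r, c) < threshold:
            empties = [(rr, cc) for rr in range(size) for cc in range(size)
                       if grid[rr][cc] == EMPTY]
            if not empties:
                break  # nowhere to move
            rr, cc = rng.choice(empties)
            grid[rr][cc], grid[r][c] = grid[r][c], EMPTY
            moves += 1
    return moves


def run(size=20, frac_empty=0.2, threshold=0.4, max_steps=100, seed=1):
    """Run until no agent moves (or max_steps); return index before and after."""
    rng = random.Random(seed)
    grid = make_grid(size, frac_empty, rng)
    before = mean_like_share(grid)
    for _ in range(max_steps):
        if step(grid, threshold, rng) == 0:
            break
    return before, mean_like_share(grid), grid
```

Calling `run()` returns the average share of like neighbours before and after the agents have relocated, together with the final board. Comparing the two figures for different values of `threshold` is one simple way to explore how modest individual preferences relate to the aggregate pattern.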


1.4.1 Implications of Schelling’s Model 


With this in mind, our investigation will explore the reasons for Schelling’s great 
success with such a simple model, and its ramifications for future modelling 
endeavours. How did such an abstract formulation of the residential segregation 
phenomenon become so powerful? What theoretical importance did Schelling 
attribute to his model’s construction, and how did that influence his interpretation 
of the results? Finally, how does his model illuminate both the strengths and 
weaknesses of social simulation used for the purpose of developing social theory? 
All of these questions bear upon our final examination of the most appropriate 
theoretical framework for social simulation as a whole. 
1.5 Social Simulation in Application: The Case of Demography


Having developed some theoretical approaches to social simulation, we will need to 
move on to discuss the establishment of these methods as a trusted and functional 
element of the social scientist’s toolbox. We will take on this problem by
investigating the field of demography, the study of human population change. Demography
is a fundamentally data-focused discipline, relying at times on vast amounts of
complicated survey data to understand and predict the future development of 
populations (Silverman et al. 2011). We will investigate the core assumptions 
underlying demographic research, discuss and analyse the methodological shifts 
that have occurred in the field over the last 350 years (Courgeau et al. 2017), and 
develop a framework for a model-based demography that incorporates simulation as 
a central conceit. 


1.5.1 Building Model-Based Demography 


In order to understand the challenges facing a model-based social science, we 
will discuss several examples of agent-based approaches to demography. Starting 
with some inspirational work from the early 2000s (Billari and Prskawetz 2003; 
Axtell et al. 2002; Billari et al. 2007), we will move on to current work integrating 
statistical demographic modelling directly into an agent-based approach. We will 
examine the benefits and the shortcomings of these models, and in the process 
develop an understanding of the power of a scenario-based approach to the study 
of future population change. Finally, we will evaluate the progress of model-based 
demography thus far, and present some conclusions about the lessons we can take 
from this in our future research efforts. 


1.6 General Summary 


This text is organised essentially as a three-part argument. In Part I, the theoretical
underpinnings of Alife are examined, along with their relationship to similar modelling
frameworks within population biology. Part II reviews the current state-of-the-art
in simulation for the social sciences, with a view toward drawing comparisons 
with Alife methodology. A subsequent analysis of theoretical frameworks for social 
simulation as applied to a specific case study provides a means to draw these 
disparate ideas together, and develop insight into the fundamental philosophical and 
methodological concerns of simulation for the social sciences. Finally, in Part III we 
take the specific example of demographic research and attempt to build a cohesive 
theoretical framework through which social simulation approaches can be integrated
productively with empirically-focused social science. 


1.6.1 Alife Modelling 


This portion of the text aims first to describe the relatively new field of artificial 
life, and discuss its goals and implications. Once the background and import of 
Alife is established, then the shortcomings and theoretical pitfalls of such models 
are discussed. Given the strong association of Alife with biology and biological 
modelling, the theoretical discussion includes in-depth analysis of a framework for 
modelling in population biology proposed by Levins (1966, 1968). This analysis 
allows the theoretical implications of Alife to be placed in a broader context in 
preparation for the incorporation of further ideas from social science simulation. 


1.6.2 Simulation for the Social Sciences 


Agent-based modelling in the social sciences is a rather new development, similar 
to Alife. Social scientists may protest that modelling of various types has been 
ongoing in social science for centuries, and this is indeed true; however, this more 
recent methodology presents some similarly novel methodological and theoretical 
difficulties. This section of the text begins by describing the past and present 
of agent-based modelling in the social sciences, discussing the contributions and 
implications of each major development. Then, a discussion of current theoretical 
concerns in agent-based models for social science proceeds, describing modelling 
frameworks which attempt to categorise the various types of social simulations 
evident thus far in the field. Finally, an analysis of the problems of explanation via 
simulation which are particularly critical for the social sciences allows us to develop 
a broader understanding of these in a philosophical context. 


1.6.3 Schelling’s Model as a Case Study in Modelling 


Schelling’s model of residential segregation is notable for its impact and influence 
amongst social scientists and modellers (Schelling 1978). Despite the model’s 
simplicity, the illustration it provided of a problematic social issue provoked a 
great deal of interest, both from social scientists interested in modelling and those 
formulating empirical studies. This investigation of Schelling will focus on how 
his model surpassed its simplicity to become so influential, and how this success 
can inform our discussion of agent-based modelling as a potentially powerful 
methodology in social science. 
1.6.4 Developing a Model-Based Demography


Demography is an old discipline, originating from a major conceptual shift in 
the treatment of demographic events like birth, death and reproduction in the 
seventeenth century (Graunt 1662). In the years since, demography has gone 
through a series of methodological shifts, going from relatively straightforward 
early statistical work to present-day microsimulation and multilevel modelling 
approaches (Courgeau 2012). Simulation approaches to demography are now 
gaining popularity, particularly in areas such as migration, where simulation offers 
an opportunity to better understand the individual decision-making that plays a 
vital role in such processes (Klabunde and Willekens 2016). In Part III
of this book, we will examine the methodological foundations of demography in 
detail, and investigate how simulation approaches can contribute to this highly 
empirical social science. We will present a proposal for a model-based approach to 
demography which attempts to resolve the conceptual gaps between the empirical 
focus of statistical demography and the explanatory and exploratory tendencies of 
social simulation. We will then discuss some applied examples of model-based 
demographic research and evaluate how these studies can influence our future efforts 
both in demography and in the social sciences more generally. 


1.6.5 General Conclusions of the Text: Messages for the 
Modeller 


By its nature, this text encompasses a number of different threads related to
agent-based modelling to bring the reader to an understanding of both the positives and
the negatives of this approach for the researcher who wishes to use simulation in 
the social sciences. Each of the three portions of the text builds upon the previous, 
with the goal of presenting modellers with both theoretical and practical concepts 
they can apply in their own work. Part I of the text demonstrates the problems 
and limitations of biologically-oriented agent-based models; such an approach is 
inherently theory-dependent, and modellers must be aware of this fact and justify 
the use of their model as a means to test and enhance their theories. 

Part II of the text, focusing on simulation for the social sciences, describes the 
current state of this field and the various major disputes regarding its usefulness 
to the social scientist. This new type of modelling approach provides both new 
possibilities and new problems for the social scientist; the use of simulation can be a 
difficult balancing act for the researcher who wishes to provide useful conclusions. 
Thus, the social scientist interested in modelling must be knowledgeable regarding
these methodological difficulties, as analysed here, and avoid the impulse to produce 
highly complex models which may fall foul of the guidelines discussed. 

In order to reinforce these points, we discuss an example of a powerful, 
successful, and simple model used within the social sciences: Schelling’s residential 
segregation model (Schelling 1971, 1978). In the context of the modelling
frameworks discussed in the previous portions, Schelling’s model provides a platform for
examining those frameworks in a detailed fashion. Schelling’s model demonstrates 
that the most useful models are not the most complex; simplicity and analysability 
are much more valuable than complexity for those who wish to understand the 
phenomena being modelled. In essence, no model can do it all, and a knowledge of 
the modelling frameworks under discussion here and their implications allows one 
to understand the necessary balancing act of designing and implementing a model 
in much greater depth. 

Perhaps the most important balancing act related here is the tension between the 
need for a modeller to provide a theoretical backstory and the desire to minimise a 
model’s theory-dependent nature. This is a common thread running throughout the 
text, whether the model in question is related to biology or social science. Modellers 
who create a model without a theoretical backstory that provides a context may find 
themselves creating a model with no relevance except to itself, while those who 
create a model with too great a degree of theory-dependence may find themselves 
warping their model into one restricted by theoretical bias, once again moving the 
model further from real-world applicability. Balancing acts in model creation
and implementation are often practised intuitively by modellers, yet this
tension between backstory and theory dependence is rarely discussed explicitly
in the literature.

Part III of the text brings us to the specific example of demography, a discipline 
where agent-based modelling approaches have begun to take hold in certain areas 
of enquiry. Building upon the foundations laid in previous chapters, the model- 
based demography framework described here presents a positive case-study for the 
integration of simulation with empirically-focused social science. Example models 
demonstrate how considered choices during model construction, development and 
implementation produce results that add to demographic knowledge without letting 
the simulations become unmanageable. The intention is for these models to serve as 
positive examples of pragmatic, considered modelling practices; each of them has 
limitations, but is still able to provide insight on the research questions it targets. 


1.6.6 Chapter Summaries 


The analysis begins with an overall review of the philosophical issues and debates 
facing simulation science in general. Chapter 2 focuses on these general concerns, 
providing a summation of current thinking regarding issues of simulation method- 
ology. A large portion of this chapter focuses upon the problem of validation of 
simulation results, which is an issue that is of great importance to the theoretical 
frameworks under examination. A further discussion of the difficulties inherent in 
linking the artificial with the natural provides a broader philosophical context for 
the discussion. 
Chapter 3 picks up at this point, focusing on the efforts of Alife researchers to 
make the artificial become ‘real.’ After introducing the concepts of ‘strong’ and 
‘weak’ artificial life, the significance of these two perspectives is discussed in the 
context of the still-developing philosophical debates of Alife practitioners. A central 
theme in this chapter is the drive to develop empirical Alife: simulations which can 
supplement datasets derived from real-world data. Taking into account the problems 
of validation discussed earlier and the two varying streams of Alife theory, a possible 
theoretical framework for underwriting empirical Alife is developed. 

Chapter 4 moves on to population biology, drawing upon modelling frameworks 
developed within that discipline to strengthen our burgeoning theoretical backstory 
for Alife. Levins’ three types of models, described in his seminal 1966 paper, 
provoked a great deal of debate regarding the strengths and weaknesses of modelling 
in biology, a debate which continues to rage today. After an analysis of Levins’ three 
types, an expanded version of his framework is developed in the hope of providing 
a more pragmatic theoretical position for the model-builder. 

Chapter 5 focuses mainly upon a review of the current state-of-the-art in 
simulation for the social sciences. Beginning with a look at early models, such as 
Schelling’s residential segregation model (Schelling 1978) and Axelrod’s iterated 
prisoner’s dilemma (Axelrod 1984), we move on to more current work including 
Cederman’s work within political science (Cederman 1997). This leads to a review 
of common criticisms of this growing field and the methodological peculiarities 
facing social-science modellers. These peculiarities are not limited to social sim- 
ulation, of course; social science as a whole has unique aspects to its theory and 
practice which are an important consideration for the modeller. 

Chapter 6 then proceeds with an analysis of social simulation in the context of 
the theoretical frameworks and issues laid out thus far. First, an overall analysis 
of Alife and related modelling issues in population biology gives us a set of 
frameworks useful for that particular field. Next, these theoretical concerns are 
applied to social simulation in the hope of discovering the commonalities between 
these two varieties of simulation science. This leads to a discussion of the possibility 
of using social simulation to drive innovations in social theory as a whole; the work 
of Luhmann is used as an example of one perspective that may prove valuable 
in that respect (Luhmann 1995). Finally, having placed social simulation within a 
theoretical framework, the debate regarding the usefulness of social simulation for 
social explanation is summarised and discussed. 

Chapter 7 extends the analysis begun in Chap. 5 by utilising a case study: 
Schelling’s well-known residential segregation model (Schelling 1978). Schelling’s 
model is noted for its simplicity: residential segregation is illustrated by a single 
rule applied to individual agents on a simple two-dimensional grid. This chapter 
investigates the reasons behind the powerful impact of Schelling’s abstract 
formulation, placing the model in the theoretical constructs described thus far. The 
implications of Schelling’s model for social theory are also discussed, with reference 
to the Luhmannian modelling perspective described in the previous chapter. 
Chapter 8 offers a conclusion to the arguments laid out in Parts I and II. 
Having examined Alife modelling, modelling in biology, and social simulation, 
future directions for substantive modelling works are proposed. In the context of 
social simulation specifically, the problems of validation and explanation introduced 
earlier are revisited. The overall questions of methodological individualism in social 
simulation are investigated as well, with an eye toward developing methods of 
simulation which can transcend the perceived limitations on the explanatory power 
of social science models. Having used Schelling as a case study for the modelling 
frameworks under discussion, this chapter will also discuss how other modelling 
methodologies may fit cohesively into these frameworks. 

Chapter 9 marks the beginning of Part III, in which we delve into the appli- 
cation of agent-based modelling to the specific discipline of demography. This 
chapter describes the historical evolution of the field, detailing the cumulative 
development of four successive methodological paradigms. From there we propose 
a methodological framework for a model-based demography, in which simulation 
helps demographers to overcome three key epistemological challenges within 
the discipline and helps avoid the insatiable ‘beast’ of over-reliance on detailed 
demographic data. 

Chapter 10 moves beyond theoretical aspects of demography and dives into the 
practice of agent-based modelling in the field. We begin by discussing two examples 
in brief: Axtell et al.’s model of the decline of the Anasazi (Axtell et al. 2002); 
and Billari’s Wedding Ring model of partnership formation (Billari et al. 2007). 
For our third, more detailed example, we will examine the Wedding Doughnut 
— an extended version of the Wedding Ring model which incorporates statistical 
demographic methods and adds a simple representation of individual health status 
(Silverman et al. 2013a; Bijak et al. 2013). Sensitivity analysis using Gaussian 
process emulators is also introduced as a means of understanding the impact of 
model parameters and their interactions on the final output of interest. 

Chapter 11 focuses exclusively on a single model: the Linked Lives model 
of social care supply and demand (Noble et al. 2012; Silverman et al. 2013b). 
This model is a significant leap forward in complexity compared to the Wedding 
Doughnut, incorporating a simple economic system, spatial elements, partnership 
formation/dissolution, social care need and provision, and migration. We examine 
the model in detail, including another sensitivity analysis using Gaussian process 
emulators, and discuss how the strengths of this model can serve as a useful 
exemplar for future modelling efforts in demography. 

Finally, Chap. 12 summarises our findings in Part III and links them to the 
theoretical discussions presented earlier in the volume. We evaluate the current state 
of model-based demography, and discuss how the development of this approach 
can inform efforts to bring agent-based modelling to other areas of the social 
sciences. Ultimately we will take model-based demography as a positive example of 
a discipline taking new methods and weaving them gradually and thoughtfully into 
the broader tapestry of demographic research. Demography benefits particularly 
from having taken a cumulative approach to methodology over the last three and a half 
centuries. Other disciplines can benefit from the insights presented by model-based 
demography, and in turn develop new approaches to simulation that may strengthen 
other areas of social science alongside demography’s focus on empirical relevance. 


1.6.7 Contributions 


The major contributions of this text lie within its philosophical and methodological 
study of modelling within both artificial life and the social sciences. These analyses 
provide a novel perspective on agent-based modelling methodologies and their 
relationship to more conventional empirical science. Other elements of the text 
present a sort of anthropological study of modelling within these areas of science, 
in the hope of providing a more cohesive view of the use and impact of simulation 
in a broader context. 

Elements of Chap. 3 were based upon a work published in the proceedings 
for Artificial Life IX; this work aimed to develop a theoretical framework for 
empirical studies in Alife by providing comparison with other, more established 
fields of science. Chapter 4 was based substantially on a paper written by myself 
and Seth Bullock describing the pitfalls of an approach to modelling that relies 
upon ‘artificial worlds’; this work draws upon the papers of Levins, Braitenberg 
and others. Elements of Chaps. 4 and 5 were drawn from a paper by myself and 
John Bryden which was published in the proceedings for The European Conference 
on Artificial Life in 2007. This paper proposed a new means of social simulation 
which could provide a deeper insight into a fundamental social theory. Chapter 9 
is based upon a collaborative paper written with Daniel Courgeau, Jakub Bijak and 
Robert Franck which was published in an edited volume on agent-based modelling 
for demography. Chapters 10 and 11 are based largely upon two collaborative 
papers written with members of the Care Life Cycle Project at the University of 
Southampton, which ran from 2010 to 2015. 

In summary, this text provides a new synthesis of theoretical and practical 
approaches to simulation science across different disciplines of the social sciences. 
By integrating perspectives from Alife, biology and social science into a single 
approach, this text provides a potential means to underwrite the use of simulation 
within these fields as a means to generate new theory and new insight. Particularly 
in fields relatively new to simulation, such as social science, the acceptance of this 
methodology as a valid means of enquiry is a slow process; this text hopes to 
accelerate the growth of simulation within this field by providing a coherent theoretical 
background to illustrate the unique strengths of computational modelling, while 
simultaneously delineating its unique pitfalls. The detailed treatment of simulation 
modelling in demography will further illustrate how relatively disparate frameworks 
— in this case the data-centric demographic approach and the explanatory focus of 
agent-based modelling — can be combined to produce new avenues of productive 
enquiry. 
Chapter 2 
Simulation and Artificial Life 


2.1 Overview 


Before beginning this extensive analysis of agent-based modelling, the philosoph- 
ical and methodological background of simulation science must be discussed. 
Scientific modelling is far from new; conceptual and mathematical models have 
driven research across many fields for centuries. However, the advent of computa- 
tional modelling techniques has created a new set of challenges for theorists as they 
seek to describe the advantages and limitations of this approach. 

After a brief discussion of the historical foundations of scientific modelling, 
both mathematical and computational, some distinctions that characterise modelling 
endeavours will be described. The distinction between models for science and 
engineering problems provides a useful framework in which to discuss both the 
goals of modelling and the difficult problem of validating models. These discussions 
are framed in the context of artificial life, a field which depends upon computational 
modelling as a central methodology. 

This chapter lays the groundwork for the next two chapters, and by 
extension Part I of this text as a whole. The discussion here of emergence, and the 
related methodologies of ‘bottom-up’ modelling, will allow us to understand the 
major philosophical issues apparent in the field of computational modelling. This, 
in turn, will prepare us for an in-depth discussion of the field of Artificial Life in 
the following chapter, which will provide a backdrop for the discussion of general 
issues in biological modelling in Chap. 4. Similarly, the discussion here of the debate 
regarding the explanatory capacity of simulation models will be a recurring thread 
throughout this text, in all three parts of the argument. 
2.2 Introduction to Simulation Methodology 


2.2.1 The Goals of Scientific Modelling 


The construction of models of natural phenomena has been a continuous feature 
of human scientific endeavour. In order to understand the behaviour of the systems 
we observe around us, we choose to construct models to allow us to describe that 
behaviour in a simplified form. To use our central bird migration example, imagine 
that a researcher wishes to describe the general pattern of a specific bird species’ 
yearly migration. That researcher may choose to sit and observe the migrations of 
that species year-on-year for a lengthy period of time, allowing for the development 
of a database showing that species’ movement over time. 

However, a model of that species’ migration patterns could save our researcher a 
great deal of empirical data collection. A model which takes the collected empirical 
data and derives from it a description of the bird species’ general behaviour during 
each migration season could provide a means of prediction outside of a constant 
real-world observation of that species. Further, one can easily imagine models of the 
migration problem which allow for detailed representation of the birds’ environment 
and behaviour, allowing for a potentially deeper understanding of the causes of these 
migrations. 

Thus, in the context of this discussion, the goals of scientific modelling encom- 
pass both description and explanation. A simplified description of a phenomenon 
is inherently useful, reducing the researcher’s dependence on a continuous flow 
of collected empirical data, but explanation is the larger and more complex goal. 
Explanation implies understanding: a cohesive view of the forces and factors that 
drive the behaviour of a system. To develop that level of understanding, we must 
first understand the nature and limitations of the tools we choose to employ. 


2.2.2 Mathematical Models 


Throughout the history of science, mathematical models have been a vital part 
of developing theories and explanations of natural phenomena. From Newton’s 
laws of motion to Einstein’s general relativity, mathematical models have provided 
explicit treatments of the natural laws that govern the world around us. As we shall 
come to understand, however, these models are particularly well-suited to certain 
classes of phenomena; the concise description of natural laws that follows from a 
mathematical model becomes ever more difficult to develop as the phenomena in 
question become more complex. 

Even with the appealing simplicity of a mathematical description of a 
phenomenon, however, certain methodological difficulties come into play. Some models 
may require a very complex system of linked equations to describe a system, 
creating a vast number of parameters which are not known a priori (see Chap. 4 for 
discussion of this in relation to population biology). These black-box simulations 
create difficulties for both theorists and experimentalists; theorists struggle to set 
values for those parameters, while experimentalists likewise struggle to validate 
models with such a range of potential parameters. 

There is also the question of the accuracy of a given mathematical model, which 
can be difficult to determine without repeated testing of that model’s predictive 
capacity. For example, the motion of objects in space can be described by Newton’s 
laws of motion; however, for very large bodies which are affected by the tug of 
gravity, we must use Einstein’s general relativity to describe that motion. If we turn 
the other way and wish to describe the motions of atoms and particles, then we 
must use quantum mechanics. The intersections of those models, particularly that 
of Einstein’s relativity and quantum mechanics, are far from easy to develop; the 
fabled union of these two theories has occupied physicists for decades and will 
likely continue to do so for some time (Gribbin 1992). 


2.2.3 Computational Models 


The advent of easily-available computing power has revolutionised the process 
of scientific modelling. Previously intractable mathematical models have become 
tractable through the sheer brute-force calculating power of today’s supercomputers. 
Supercomputers now allow physicists to run immense n-body simulations of 
interacting celestial bodies (Cox and Loeb 2007), model the formation of black- 
hole event horizons (Brugmann et al. 2004) and develop complex models of global 
climate and geophysics (Gregory et al. 2005). 

Beyond the sheer number-crunching power of computational methods, the 
flexibility of computational modelling has resulted in the development of new 
varieties of models. While the specific characteristics of each simulated celestial 
body in a model of colliding galaxies are relatively unimportant, given that each of 
those galaxies contains billions of massive bodies with similar gravitational impacts 
on surrounding bodies, certain other phenomena depend on complex individual 
variation to be modelled accurately. Evolution, for example, depends upon the 
development of new species through processes of individual mutation and variation 
mediated by natural selection (Darwin 1859); thus, a model of evolving populations 
requires a description of that individual variation to be effective. 

Take once again our central example. If our hypothetical bird-migration 
researcher hypothesizes that migration behaviour is due to evolutionary factors 
that he may be able to represent in a model, then he must be able to represent the 
effects of biological evolution in some form within that model. If he wishes to 
represent those effects, the digital birds within his model would require individual 
complexity that can produce variation within the simulated species. In contrast, if he 
were modelling only the patterns of the bird movements themselves, then he need 
only represent the impact of those movements on the other agents in the simulation, 
as in the colliding galaxy model above; individual variation in those agents is not 
such a driving force behind the patterns of bird movements as it would be in an 
evolutionary model of the development of those movements. 


2.2.4 The Science Versus Engineering Distinction 


As computational modelling has developed, so has a distinction between varying 
computational approaches. Some models seek to provide predictive power related 
to a specific physical or natural phenomenon, while others seek to test scientific 
hypotheses related to specific theories. The first type has been characterised as 
modelling for engineering, and the second has been described as modelling for 
science (Di Paolo et al. 2000; Law and Kelton 2000). 

Models for engineering are dependent on empirical data to produce predictions. 
For example, a transportation engineer may wish to examine the most efficient 
means for setting traffic signals (Maher 2007). To determine this, the modeller 
will examine current signal settings, note the average arrival time of vehicles at 
each junction, analyse the anticipated demand for each service, and other similar 
factors. With this information in hand the engineer can produce a model of the 
current operating traffic pathways, and alter parameters of those simulated services 
to attempt to produce an optimum scheduling algorithm for the new signals. 

Similarly, to stretch our bird example to the engineering realm, imagine that 
our migration researcher has decided to model the dissemination of information 
via messenger pigeon. If he wishes to find an optimum schedule on which to 
release and retrieve these pigeons, he could use an engineering-type model to 
solve this problem. He could examine current and past messenger-pigeon services, 
note the average transit time for the delivery and retrieval of those messages, and 
assess the level of rest and recuperation needed by each bird. With an understanding of 
these factors, the researcher could develop a model which would provide optimum 
release schedules, given different potential numbers of birds and varying demand 
for message delivery. 

Models for science, in contrast, focus instead on explanation and hypothesis- 
testing. A model of the development of animal signalling by its very nature cannot 
depend on the availability of empirical data; after all, we cannot simply watch 
random evolving populations in the hope that one of them may develop signalling 
behaviours while we wait. Instead, the model is based upon a hypothesis regarding 
the contributing factors that may produce the development of signalling behaviours; 
if the model produces a simulated population which displays those behaviours, then 
the modeller may attribute more validity to that hypothesis. This approach is the 
focus of this text. 

Returning to the bird example, our researcher would be taking a similarly 
scientific modelling perspective if he wished to construct a model which illustrates 
the influence of individual bird movements on migrating flocks. He hypothesizes 
that individual birds within the flock assume controlling roles to drive the timeliness 
of the flock’s migratory behaviour. He could test such a hypothesis by developing a 
simulated population of migrating birds in which certain individual agents within 
the simulation can affect the movement of multiple other migrating agents; if 
the presence of those controlling agents appears to confirm the necessity of such 
individuals to keep migrations moving in an appropriate timeframe, then he may 
propose that such mechanisms are important to real-world migratory behaviour. 


2.2.5 Connectionism: Scientific Modelling in Psychology 


The advent of this sort of scientific modelling has produced not only new types 
of models, but new fields of enquiry within established fields. The development of 
the connectionist approach to the study of behaviour and cognition provides one 
example of the scientific modelling perspective, and introduces us to the concept of 
emergent explanations. 

The development of computational modelling techniques together with advances 
in neuroscience led some researchers to investigate models of neural function. These 
neural network models consist of simplified neuronal units, with specified activation 
thresholds and means of strengthening or weakening synaptic connections, which 
aim to reproduce the neural basis of behaviours (Rumelhart and McClelland 1986). 
The idea that models of this nature could demonstrate the emergence of cognitive 
behaviour brought with it related ideas concerning the mind that caused some 
controversy. 
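
To make the idea of a simplified neuronal unit concrete, the following is a minimal sketch in Python of a single unit with a fixed activation threshold whose connection weights are strengthened or weakened in response to errors. It is not drawn from Rumelhart and McClelland's models: the error-driven update rule, the function names (`predict`, `train_unit`), the parameter values and the choice of logical AND as the behaviour to reproduce are all assumptions made purely for illustration.

```python
def predict(weights, inputs, threshold=0.5):
    """Fire (1) if the weighted input sum reaches the activation threshold."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= threshold else 0


def train_unit(samples, threshold=0.5, rate=0.1, epochs=20):
    """Adjust connection weights with a simple error-driven rule (assumed)."""
    weights = [0.0, 0.0]
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, inputs, threshold)
            # Strengthen connections from active inputs when the unit
            # under-fires, weaken them when it over-fires.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
    return weights


# Hypothetical target behaviour: the unit should fire only when both
# of its inputs are active (logical AND).
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

if __name__ == "__main__":
    learned = train_unit(AND_SAMPLES)
    for inputs, target in AND_SAMPLES:
        print(inputs, "->", predict(learned, inputs), "(target", target, ")")
```

Nothing in the code stores an explicit rule for AND; the behaviour resides only in the learned connection strengths, which is the sense in which connectionist accounts describe behaviour as arising from low-level units.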

In order for the connectionist to assert that their model can represent cognitive 
behaviour, one must assume that mental states correspond to states of activation 
and connection strengths in a given neural network. This concept was derided by 
some who viewed it as an overly reductionist stance, and who argued that the symbolic 
manipulation capability of the mind is in fact crucial to understanding cognitive function 
(Fodor and Pylyshyn 1988). This is related to the perspective espoused by many 
in the field of artificial intelligence, in which the manipulation of symbols was 
considered essential to the development of intelligence (Newell and Simon 1976). 

The connectionist thus forms scientific models which aim to test the hypothesis 
that learning mechanisms in neural networks can lead to the development of 
cognition. While psychologists are certain that the human brain functions via the 
interaction of billions of individual neurons, the degree of correspondence between 
these neural-network models and actual brain function is debatable (see Pinker and 
Mehler 1988 for a damning critique of the unrealistic results of a connectionist 
model of language, as one example). Many models of this type use idealised 
neuronal units to investigate possible explanations of behaviour, such as creating 
non-functional ‘lesion’ areas in a network designed to perform visual search tasks 
as a means of theorising about possible causes of visual deficits (Humphreys et al. 
1992). In this respect, these sorts of connectionist models are designed to test 
hypotheses and develop theories rather than generate predictions based on empirical 
data. 
2.2.6 Bottom-Up Modelling and Emergence 


As noted above, the controversy surrounding connectionist modelling hinged upon 
one of its base assumptions: the idea that the low-level interaction of individual 
neuronal units could produce high-level complex behaviour. The cognitivists and 
computationalists of psychology found this distressing, as reducing cognition to 
collections of neural activations eliminates the concept of symbolic manipulation 
as a precursor to thought and cognition; the ‘distributed representation’ concept 
proposed by connectionists would remove the necessity for such higher-level 
concepts of discrete symbol manipulation by the brain (Fodor and Pylyshyn 1988). 

Of course, such perspectives need not necessarily be diametrically opposed. 
One can certainly imagine connectionism forming a useful element of the study 
of cognition, with the cognitivists continuing the study of mental representation 
and symbol manipulation in relation to larger concepts of mental behaviour that 
are less well-suited to the connectionist modelling perspective. Indeed, given 
that connectionist systems can implement symbol-manipulation systems, these 
philosophical differences seem minor (Rowlands 1994). 

However, the idea of this type of ‘bottom-up’ modelling is crucial, and as 
a consequence the debate over this type of modelling bears great relevance to 
our discussion. The view proffered by connectionists that models of low-level 
interacting units can produce the emergence of higher-level complexity is a central 
element of the modelling perspectives being analysed in this text. The particular 
relevance of this controversy when considering problems of validation and scientific 
explanation will be examined further both in this chapter and in Chaps. 6 and 7 in 
particular. 


2.3 Evolutionary Simulation Models and Artificial Life 


2.3.1 Genetic Algorithms and Genetic Programming 


Connectionism was far from the only prominent example of a computational 
innovation taking cues from biology. The use of genetic algorithms, modelled 
on the processes of biological evolution, has a long history in the computational 
sciences. Given the extensive study of natural selection in biological systems as 
an optimisation process, and the need for increasingly innovative optimisation 
techniques within computer science, the use of an analogue of that process for 
computational applications seemed a natural fit. Indeed, as early as the 1950s the 
computers available were being put to use on just these sorts of problems (Fraser 
1957). 

Since these early days of experimentation, the genetic algorithm has become an 
established method of optimisation in certain problem spaces. Such algorithms 
seek to encode potential solutions to the problem at hand in forms analogous to 
a biological ‘genotype’; each solution is then examined to determine its suitability 
as a solution, and evaluated according to a specified fitness function. The most fit 
solutions can then be combined, individual mutations can be generated if desired, 
and the next generation of potential solutions is subjected to the same process. 
Over many generations, the genetic algorithm may find a solution suitable for the 
problem, and in some cases the solution comes in an unexpected and novel form 
due to the influence of this selection pressure. Such systems came into the spotlight 
quite prominently in the 1970s, when John Holland’s book on the topic provided a 
strong introduction (Holland 1975); later works by Goldberg (1989), Fogel (1988), 
and Mitchell (1996) cemented the position of genetic algorithms as a useful method 
for optimisation and search problems. 
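
The cycle described above (encode, evaluate, select, recombine, mutate, repeat) can be sketched in a few lines of Python. This is a minimal illustration rather than any particular published algorithm: the bit-string genotype, the toy fitness function (counting 1-bits), tournament selection, single-point crossover and all parameter values are assumptions chosen for brevity.

```python
import random


def fitness(genotype):
    """Toy fitness function (an assumption): the number of 1-bits."""
    return sum(genotype)


def tournament(population, rng, k=3):
    """Select the fittest of k randomly chosen individuals."""
    return max(rng.sample(population, k), key=fitness)


def crossover(parent_a, parent_b, rng):
    """Combine two parents at a single random cut point."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def mutate(genotype, rng, rate):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in genotype]


def evolve(length=20, pop_size=30, generations=60, mutation_rate=0.02, seed=0):
    """Run the generational loop and return the best genotype seen so far."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        population = [
            mutate(crossover(tournament(population, rng),
                             tournament(population, rng), rng),
                   rng, mutation_rate)
            for _ in range(pop_size)
        ]
        # Keep a record of the fittest individual across all generations.
        best = max(population + [best], key=fitness)
    return best


if __name__ == "__main__":
    solution = evolve()
    print(fitness(solution), solution)
```

Setting `mutation_rate` to zero leaves recombination as the only source of novelty, so a bit value lost from every member of the population can never be recovered; this is one simple way to see why a modest amount of random variation matters.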

Genetic algorithms do suffer from methodological difficulties, of course; certain 
problems are not well-suited to genetic algorithms as a means of finding appropriate 
solutions. In addition, the design of a useful fitness function can be extremely 
difficult, as the programmer must be careful to avoid solutions which cluster around 
local optima (Mitchell 1996). Incorporating an appropriate amount of variation in 
the generated population can be vital for certain applications as well, as the right 
level of random mutation can provide a useful means to escape those local optima. 


2.3.2 Evolutionary Simulations and Artificial Life 


While genetic algorithms became popular amongst certain elements of the computer 
science community, they also drew great interest from those interested in the 
biological function of evolution. As the artificial intelligence community sought to 
model the fundamentals of human intelligence and cognition, others sought to use 
computational methods to examine the fundamentals of life itself. 

The field of artificial life, or ALife, has complex beginnings, but is most often 
attributed to Langton (2006) who first christened the field with this title. ALife, 
however, has strong links with the artificial intelligence community (Brooks 1991), 
as well as with the earlier modelling traditions of ecology and population biology. 
The influence of the artificial intelligence community, the interest in bottom-up 
modelling as alluded to earlier in our review of connectionism, and the development 
of new techniques to produce adaptive behaviour in computational systems all seem 
to have had a hand in the development of ALife. 

ALife work to date has revolved around a number of related themes, but all of 
them share some method of reproducing the mechanics of biological adaptation in 
computational form. Genetic algorithms as described above are perhaps the most 
prominent example, with a great number of evolutionary simulations using such 
algorithms or some version thereof to provide that element of adaptation. While the 
members of this growing research community moved forward with these methods 
of simulating evolutionary systems, a related set of new challenges faced that 
community. 


2.3.3 Bedau and the Challenges Facing ALife 


Mark Bedau’s summary of the field of artificial life (Bedau 2003) provides one 
view of the array of potential challenges facing the ALife researcher. In Bedau’s 
view, ALife clearly displays the potential to enhance our understanding of the 
processes of life, and numerous fundamental questions that spring from those 
processes: 


How does life arise from the non-living? 


1) Generate a molecular proto-organism in vitro. 

2) Achieve the transition to life in an artificial chemistry in silico. 

3) Determine whether fundamentally novel living organisations can arise from inanimate 
matter. 

4) Simulate a unicellular organism over its entire lifecycle. 

5) Explain how rules and symbols are generated from physical dynamics in living systems. 


Bedau begins his extensive list of ALife challenges with a look at the potential 
for these new methods of simulation to model the origins of life. Of course, there 
is great debate over the best means to simulate such early beginnings. Simulating 
the development of cell structures has been an important theme in ALife (e.g., 
Sasahara and Ikegami 2004), as well as the development of simple self-replicating 
structures (Langton 1990). This is not entirely surprising, given that Von Neumann’s 
self-replicating cellular automaton was clearly an influence on those seeking to 
understand the development of such forms in silico (Von Neumann and Burks 1966). 

However, at what point might we agree that such self-replicating digital organ- 
isms have achieved a ‘transition to life’ as proposed by Bedau? At what point 
does that simulated organism become an instantiation of the laws governing the 
development of natural life? Agreement here is hard to come by; some argue that 
the status of ‘alive’ is best conferred on organisms that can self-reproduce (see Luisi 
(1998) for an evaluation of this and other definitions), while others argue that 
self-motility¹ is a more important determining factor (Hiroki et al. 2007), and still others 
appeal to the concepts of self-organisation and autopoiesis² (Maturana and Varela 
1973). This issue hinges upon the theoretical perspective of the modeller to a large 
degree: if one believes that the properties of life are just as easily realised in the 
digital substrate as they are in the biological substrate, then an ALife simulation can 
easily achieve life (given an appropriate definition of such) regardless of its inherent 
artificiality. The issue of artificiality in ALife research and its import for the theorist 
and experimentalist are explored in detail in Chap. 3. 


¹ The ability to move spontaneously or non-reactively. This is considered a vital capability for 
biological life: self-motility allows living things to move in pursuit of food sources, for example. 
See Froese et al. (2014) for a detailed exploration. 

² Autopoietic systems are systems that can produce and sustain themselves through their own 
internal processes, such as the biological cell. The concept was originally described in relation 
to biological systems, but has since been adapted to characterise cognitive and social systems as 
well. 


What are the potentials and limits of living systems? 


6) Determine what is inevitable in the open-ended evolution of life. 

7) Determine minimal conditions for evolutionary transitions from specific to generic 
response systems. 

8) Create a formal framework for synthesizing dynamical hierarchies at all scales. 

9) Determine the predictability of evolutionary manipulations of organisms and ecosys- 
tems. 

10) Develop a theory of information processing, information flow, and information 
generation for evolving systems. 


Bedau’s next set of challenges is reminiscent of one of Chris Langton’s more 
famous descriptions of artificial life, in which he stated that ALife could seek 
to examine ‘life-as-it-could-be’ rather than simply ‘life-as-we-know-it’ (Langton 
1992). In other words, given that the ALife researcher can construct an enormous 
variety of possible models, and thus living systems if we agree that life can be 
realised in silico, then ALife can be used as a platform to understand the vast variety 
of potential forms that life can create, rather than only examine the forms of life we 
currently perceive in the natural world. 

Bedau is alluding to similar ideas, proposing that ALife researchers can use their 
work to probe the boundaries of the evolution of life. By simulating evolutionary 
systems, he posits that we may be able to investigate the mechanics of evolution 
itself in a way impossible in conventional biology. The researcher is able to freely 
tweak and direct the evolutionary processes at work in his simulation, and if we 
accept that the simulation adequately represents the function of evolution in the real 
world, then such research may allow for a greater understanding of the limits of the 
evolutionary process. 


How is life related to mind, machines and culture? 


11) Demonstrate the emergence of intelligence and mind in an artificial living system. 

12) Evaluate the influence of machines on the next major evolutionary transition of life. 

13) Provide a quantitative model of the interplay between cultural and biological evolution. 

14) Establish ethical principles for artificial life. (Bedau 2003, p. 506) 


Finally, Bedau closes his list of ALife ‘grand challenges’ with more speculative 
notions of relating the development of life with the development of mind and 
society. He posits that cognition originates from similar roots as life, in that such 
mental activity is a biological adaptation like any other seen in evolving systems, 
and that in this context artificial life may provide insight into the origins of mind as 
well as life. 

The idea that mind and culture follow similar rules of adaptation to life is not 
a new one; the field of evolutionary psychology is well-established, if controversial 
(Buss 2004), and Dawkins’ discussion of the ‘meme’ in relation to cultural evolution 
is one of the more prominent examples of such thinking in sociology (Dawkins 
1995). The question of whether simulation can become sufficiently sophisticated to 
allow for the emergence of these higher-order phenomena is critical to our upcoming 
examination of simulation in the social sciences; in fact, the same philosophical 
difficulties that face modellers of cognition have been linked with similar difficulties 
in using simulation to model the roots of society and culture (see Sawyer 2002, 
2003, 2004 and the accompanying discussion in Chap. 5). 


2.4 Truth in Simulation: The Validation Problem 


2.4.1 Validation and Verification in Simulation 


While Bedau has provided a useful summary of the possible challenges facing 
artificial life in the research realm, numerous other challenges also loom in the area 
of methodology for ALife modellers. The difficulty of tying simulation results to 
the system being simulated is one not confined to ALife, but instead is common to 
all varieties of simulation endeavour. 

The validation and verification of simulations are often the most troublesome tasks for 
modellers of all types. Once a model is designed and run, the researcher must be 
able to express confidence that the results of that model bear a direct relation to 
the system of interest. This is validation, and in relation to computational models 
specifically, Schlesinger’s description of this process as a ‘substantiation that a 
computerized model within its domain of applicability possesses a satisfactory range 
of accuracy consistent with the intended application of the model’ (Schlesinger et al. 
1979) is often cited. In other words, the modeller must demonstrate that the model 
displays a measure of accuracy within the domain to which the model has been 
applied. 

Going hand-in-hand with validation is the concept of verification. A verified 
model is one in which the construction of the model is known to be accurate 
according to the framework in which that model is designed (Law and Kelton 2000). 
In relation to computational models specifically, this means that the model must be 
programmed appropriately to produce results that reflect the intent of the model’s 
construction; given the inherent complexity of the software design process, this is 
not necessarily a simple task to complete. 


2.4.2 The Validation Process in Engineering Simulations 


Validation in simulations for engineering purposes, as described earlier, would 
tend to follow a certain pattern of verifying assumptions and comparing model 
predictions to empirical data (Sargent 1982, 1985). To illustrate these concepts we 
may return to the bird migration example. If our migration researcher wished to 
construct a model which provides an illustration of the migration behaviour of a 
certain bird species, he would first need to discuss the assumptions inherent in the 
conceptual model leading to the simulation’s construction. For example, the speed 
and direction of movement of his simulated migrating populations should match 
those values gleaned from empirical observation; in general, his conceptual model 
should be informed by the available data. This verification of the conceptual model 
provides confidence that the assumptions made to produce the model have a solid 
foundation in theory related to the migrations observed in that bird species. 

Having verified the conceptual model, the researcher would then need to verify 
the model itself. While confidence in the assumptions in the conceptual model has 
been established, the researcher must confirm that these assumptions have been 
implemented appropriately in the model itself. Has the code for the simulation 
been written correctly? Are the correct parameters in place in the model, given the 
parameters required by the conceptual model? If the researcher can demonstrate that 
the simulation has been implemented correctly, then validation can proceed. 

The validation step is the most difficult, requiring a comparison of model data 
with real data. With our model migrating birds in place, the simulation should 
provide predictions of the behaviour of those birds when it is run. Do these 
simulation runs provide data that correlates appropriately with observational studies 
of that bird species’ migration? Does empirical data related to bird migrations 
in general imply that the model’s results are believable? Such questions can be 
complicated to answer, but nevertheless can be answered with access to appropriate 
empirically-collected data (see Law and Kelton 2000 for more discussion). 
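
As a minimal sketch of this comparison step, the following computes two common agreement measures between a simulated series and an observed one: the root-mean-square error and Pearson's correlation coefficient. The acceptance thresholds, the function names, and above all the numbers are invented placeholders for illustration; they are not real migration data, and the choice of measures is our assumption rather than a procedure given by Law and Kelton (2000).

```python
import math


def rmse(simulated, observed):
    """Root-mean-square error between paired simulated and observed values."""
    if len(simulated) != len(observed):
        raise ValueError("series must have equal length")
    squared = [(s - o) ** 2 for s, o in zip(simulated, observed)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def passes_validation(simulated, observed, max_rmse, min_r):
    """Accept the model only if both agreement criteria are met.
    The thresholds are a modelling decision, not a universal standard."""
    return rmse(simulated, observed) <= max_rmse and pearson_r(simulated, observed) >= min_r


# Invented placeholder values (distance flown per day, in km); NOT empirical data.
observed_km_per_day = [120.0, 150.0, 180.0, 160.0, 110.0]
simulated_km_per_day = [115.0, 155.0, 170.0, 165.0, 120.0]

if __name__ == "__main__":
    print("RMSE:", rmse(simulated_km_per_day, observed_km_per_day))
    print("r:   ", pearson_r(simulated_km_per_day, observed_km_per_day))
```

Such measures answer only the narrow question of numerical agreement. Deciding which observations are appropriate comparators, and how close is close enough, remains a judgement for the researcher.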


2.4.3 Validation in Scientific Simulations: Concepts of Truth 


The procedure outlined above is undoubtedly complex, but achievable for the 
engineer. For the scientist, however, the procedure becomes more complex still. In 
a simulation which is designed to test hypotheses, and in which a clear relation to 
empirical data is not always obvious or indeed possible, the prospect of validating 
the results of that simulation depends on different sorts of relations between theory, 
data and model. 

Appealing to the philosophy of science, Alex Schmid describes three theories of 
truth for simulations in science: the correspondence theory, the consensus theory, 
and the coherence theory (Schmid 2005). The correspondence theory holds 
that the simulation must correspond directly with facts in reality; the consensus 
theory holds that the simulation must be acceptable under idealised conditions; and 
the coherence theory holds that the simulation must form a part of a coherent set of 
theories related to the system of interest. 

The correspondence theory of truth relates closely to the methods of validation 
discussed in relation to engineering: our example bird migration simulation must 
display results that correspond directly to empirical data regarding migrating birds, 
otherwise the predictions of that simulation are of little use. Such a view coincides 
with the prevailing views of validation present in engineering models: validated 
models in the engineering perspective must demonstrate a close relationship to 
known empirical data about the problem under study. 

The consensus theory, however, is much less defined, depending as it does on 
a more communal evaluation of the truth of the model. Our bird migration model 
may not provide entirely accurate predictions for future migration behaviours, but 
under this view we may still consider the simulation to be validated if the model 
is generally illustrative of bird migration behaviour. A more general version of our 
migration simulation, one developed as an abstract model not tied to any particular 
bird species, could fall into this category. 

The coherence theory moves even further from the concept of truth most 
applicable to engineering, requiring only that the model in question fit into a given 
set of coherent beliefs. If our bird migration model fits cohesively into the general 
body of animal behaviour theory relating to migrating populations, then the model 
may provide a useful and valid addition to that theory. However, as Schmid points 
out, there is no reason that a coherent system of beliefs cannot also be completely 
false (Schmid 2005); a model of our migrating birds travelling across a flat 
planet may be accurate given the belief structures of the Flat Earth Society, but 
is nevertheless completely separate from the truth of birds migrating over a round 
planet. 


2.4.4 Validation in Scientific Models: Kuppers and Lenhard 
Case Study 


Kuppers and Lenhard’s evaluation of validation of simulation in the natural and 
social sciences sought to demonstrate the relation between theory and validation 
in models (Günter et al. 2005). As a case study, they focused upon the infamous 
scenario in climate modelling of ‘Arakawa’s trick.’ 

Norman Phillips’ model of atmospheric dynamics (Phillips 1956) was a very 
ambitious step toward climate modelling on a grand scale. The results of his 
simulation were viewed with respect by the research community of the time, 
demonstrating as they did a direct correspondence to empirically-observed patterns 
of airflow in the atmosphere, but his results were hampered by an unfortunate 
consequence of the equations: numerical instability prevented the model from 
making any long-term predictions. 

Arakawa sought to solve this problem, and eventually did so by virtue of his 
notable trick: he altered the equations of state, incorporating assumptions that were 
not derived from conventional atmospheric theory, and in the process ensured long- 
term stability in the model. Understandably this technique was met with substantial 
skepticism, but eventually was accepted as further empirical data showed the 
accuracy of Arakawa’s revised model despite the theoretical inadequacies. 
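
The general phenomenon at issue, a discretised model whose errors compound until long runs become meaningless, can be illustrated with a deliberately simple analogy. The sketch below is not Phillips' model nor Arakawa's scheme; it is a toy harmonic oscillator (x'' = -x) of our own choosing, integrated with two textbook methods. Explicit Euler inflates the oscillator's energy at every step, whereas the semi-implicit (symplectic) Euler variant keeps the energy bounded over arbitrarily long runs. The step size and run length are arbitrary assumptions.

```python
def energy(x, v):
    """Energy of a unit-mass, unit-stiffness oscillator."""
    return 0.5 * (x * x + v * v)


def explicit_euler(x, v, dt, steps):
    """Both updates use the old state; energy grows by a factor (1 + dt**2) per step."""
    for _ in range(steps):
        x, v = x + dt * v, v - dt * x
    return x, v


def semi_implicit_euler(x, v, dt, steps):
    """Velocity is updated first, then position uses the NEW velocity.
    This ordering keeps the energy bounded rather than drifting upward."""
    for _ in range(steps):
        v = v - dt * x
        x = x + dt * v
    return x, v


if __name__ == "__main__":
    x0, v0, dt, steps = 1.0, 0.0, 0.1, 10_000
    print("initial energy:      ", energy(x0, v0))
    print("explicit Euler:      ", energy(*explicit_euler(x0, v0, dt, steps)))
    print("semi-implicit Euler: ", energy(*semi_implicit_euler(x0, v0, dt, steps)))
```

The stable variant does not solve the original equations more faithfully step by step; it simply builds a conservation property into the numerical scheme. The analogy is loose, but it conveys why a change that looks unmotivated from the standpoint of theory can nonetheless rescue long-term behaviour.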

Kuppers and Lenhard use this as a demonstration that ‘performance beats 
theoretical accuracy’ (Günter et al. 2005). In other words, a simulation can 
provide successful data without having a completely accurate representation of the 
phenomenon at hand. This certainly seems to spell trouble for Schmid’s descriptions 
of relating truth in simulation to the theoretical background of that simulation. 
If we agree that simulations may achieve empirical accuracy despite theoretical 
inaccuracy, how does this affect our view of truth in simulation? 

A different approach to the relation between model and theory seems required. 
The status of a successful, validated model relies on more than a simple corre- 
spondence between the model and the research community, or the model and its 
surrounding theoretical assumptions, as Arakawa’s computational gambit demon- 
strates. 


2.5 The Connection Between Theory and Simulation 


2.5.1 Simulation as ‘Miniature Theories’ 


The relations described thus far between theory and simulation clearly lack impor- 
tant elements. For example, while the coherence theory of truth is appealing in that, 
say, an evolutionary model may find validation by fitting cohesively into the existing 
theory of biological evolution, how that model relates to theory itself remains an 
open question. Likewise, if a model must fit into an existing set of beliefs, are we 
suddenly restricting the ability of simulation to generate new theory? Is there room 
for innovation in such concepts of validation? 

One means to escape from this difficult connection between simulation and 
theory is to reform our definition completely: we may consider a simulation as a 
theory in itself. Within the simulation community this view is not uncommon: 


The validation problem in simulation is an explicit recognition that simulation models are 
like miniature scientific theories. Each of them is a set of propositions about how a particular 
manufacturing or service system works. As such, the warrant we give for these models can 
be discussed in the same terms that we use in scientific theorizing in general. (Kleindorfer 
et al. 1998, p. 1087) 


Similarly, Colburn describes simulation as a means to ‘test a hypothesis in a 
computer model of reality’ (Colburn 2000, p. 172). In the context of Arakawa’s 
trick, this perspective is attractive: in this view Arakawa’s model can serve as a 
miniature theory of its own, and the perceived disconnect between the assumptions 
of his model and the accepted atmospheric theory are of no consequence to the 
validity of the model. 


2.5.2 Simulations as Theory and Popperian Falsificationism 


If we accept that simulations can take the form of such ‘miniature theories,’ then 
perhaps the question of validation becomes instead a question of the validity of 
scientific theories. Herskovitz suggests that the process of validating simulation 
models is at its root a Popperian process of falsification (Herskovitz 1991). 
In essence, given that a simulation model is considered validated if its results 
correspond to the behaviour of real-world systems, then a model must likewise 
be considered falsified if its results do not correspond to the real-world system. 

However, Arakawa’s trick once again throws this view into question. Arakawa’s 
model by its very nature incorporates assumptions that are contrary to physical 
theory: as one example, he assumes that energy is conserved in his model of the 
atmosphere, while the real atmosphere does not display such conservation (Günter 
et al. 2005; Arakawa 1966). In this respect, is his model subject to Popperian 
falsification? If a central assumption of this model, one which informs every 
calculation of that model, is demonstrably false, does his model likewise lose all 
validity? 

Kuppers and Lenhard argue that it does not: that the theory presented by 
Arakawa’s model stands apart from the physical theory upon which it was initially 
based, and its performance speaks more to its validity than the accuracy of its 
conceptual assumptions. Likewise, we might imagine an evolutionary model falling 
victim to the same Popperian plight: assumptions made to simplify the process of 
evolution within the model may contradict observed facts about evolving species 
in nature. However, if those assumptions allow the model to display accuracy in 
another respect, either theoretical or empirical in nature, should we still decry the 
assumptions of that model? 


2.5.3 The Quinean View of Science 


In this context the Popperian view of falsification seems quite at odds with the 
potential nature of scientific models. In contrast to what Herskovitz seems to believe, 
simulations are far more than mere collections of assumptions designed to imitate 
and calculate the properties of natural phenomena. Indeed, simulations can often 
contain a rich backdrop of internal assumptions and theories, and the measure of 
a simulation’s success seems likewise more rich than a simple comparison with 
accepted data and theory. 

The Quinean view of science seems much more suited to the simulation 
endeavour (and indeed, many would argue, more suited to science of all varieties). 
The Duhem-Quine problem stands famously at odds with the Popperian view, 
asserting that scientific theories can in fact never be proved conclusively false 
on their own (Quine 1951, 1975). Given that theories depend on one or more 
(often many) related auxiliary assumptions, theories can be saved from definitive 
falsification by adjusting those auxiliary assumptions. 

For example, Newton’s laws of gravitation were able to explain a great deal 
of natural phenomena, and are still used extensively in modern physics. However, 
Newton’s laws could not explain some clearly evident and anomalous behaviours in 
astronomical bodies: the precession of the perihelion of Mercury’s orbit being a prime example. Yet, 
rather than simply disposing of Newton’s theory as inadequate, scientists instead 
strove for another explanation in addition to Newton’s theories, which later arrived 
in the guise of Einstein’s general relativity. Now Newton’s laws are presented as 
essentially a limiting case of Einstein’s theory, giving accurate results in weak 
gravitational fields and at speeds well below that of light. Likewise, points where 
Einstein’s theories break down (e.g., at the Big Bang or at the singularity inside a 
black hole) are not taken as a falsification of Einstein’s views, but rather an 
indication of the need for additional theories and 
assumptions to explain those anomalies in detail. 


2.5.4 Simulation and the Quinean View 


Having rightly discarded the Popperian view of simulation and embraced Quine’s 
notion of flexible and interconnected scientific theories, we can revise our initial 
view of simulation in the context of this understanding. Noble (1998) provides a 
summary of one view of simulation in a Quinean context, specific to artificial life: 
he argues that the Quinean view implies that new models are generated according 
to a requirement to incorporate new information in an existing theory without 
completely reorganising that theory. 

As an example Noble posits that a new simulation in artificial life may seek to 
explain a behaviour in biology as a consequence of an emergent process (Noble 
1998). The modeler may then implement a model incorporating appropriate low- 
level assumptions with the intention of running the simulation to determine whether 
the expected behaviour does indeed emerge. In this respect the model is based 
upon pre-existing conceptual frameworks concerning the high-level behaviour, and 
contributes to the addition of this new behaviour into the overall biological theory 
by providing an explanation of that behaviour in terms of an emergent phenomenon. 

More generally, we may add to this characterisation by referring back to the 
earlier discussion of simulation-as-theory. When constructing a model to allow 
for the integration of new information into an overall conceptual framework, a 
simulation model can function as an auxiliary hypothesis in and of itself: that 
simulation forms a theory, and thus is subject to the same standards as the larger 
conceptual framework. In this case, even if the model does not achieve validation 
in comparison to empirical data, all is not lost; in the appropriate Quinean fashion, 
the auxiliary hypotheses linked to that simulation may be revised to present a new 
version of the simulation-theory (perhaps by revising certain parameter values or 
similar). 

Simulation then is not simply a means to simplify calculations within a pre- 
existing theoretical framework, it is a means to modify that theoretical framework. 
The validity of a simulation is not easy to determine by any means, but a simulation 
based in an existing framework that adds sensible assumptions to that framework 
may go a long way toward justifying its existence as a substantive part of the 
overall theory. Unlike in the Popperian view, an invalidated simulation need not be 
discarded, but instead revised; assumptions used in a simulation are pliable, and an 
alteration of same could allow that model to produce insights it originally appeared 
to lack. 


2.6 ALife and Scientific Explanation 


2.6.1 Explanation Through Emergence 


Having established that simulation can perform a valuable role in the development 
of scientific theory, this analysis now turns to the role of simulation in scientific 
explanation specifically. The ability of simulation to provide complete and coherent 
scientific explanation will impact the strength with which this methodology can 
develop scientific theories; bearing this in mind, we require an understanding of the 
limits of simulation in developing explanations. This explanatory role for simulation 
is often hotly debated, particularly in the case of scientific models as described here 
(see the exchange between O’Reilly and Farah 1999 and Burton and Young 1999 
for one example, as the authors debate the explanatory coherence of distributed 
representations in psychological models). 

Within ALife, which depends upon frequently abstract simulations of complex 
emergent systems, the explanation problem takes on a new dimension. As Noble 
posits (Noble 1998), ALife models provide the unique mechanism of emergence 
which can provide new elements of an explanation of a phenomenon; however, 
debate continues as to whether explanations of higher-order phenomena through 
emergence can capture a complete explanation of those phenomena. This debate 
is exemplified once more by the debate within cognitive science regarding con- 
nectionism and distributed representations: the reductive character of connectionist 
explanation of mental phenomena is seen as overly restrictive, removing the 
possibility of higher-order explanations of mental states. 

As noted earlier in our discussion of connectionism, this debate is easily 
avoidable in one sense: if we accept that consciousness, for example, is a natural 
emergent property of neuronal activity, then the acceptance of this fact does not 
preclude the use of higher-order discussions of mental states as a means to explain 
the characteristics of that emergent consciousness. This does, however, seem to 
preclude the notion of that emergent explanation being a complete explanation; 
even if one can show that consciousness does indeed emerge from that lower- 
level activity, by the nature of emergent phenomena that consciousness is not easily 
reducible to those component activities. 


2.6.2 Strong vs Weak Emergence 


The variety of emergence discussed in the previous section is often referred to as 
‘strong emergence’: the concept that not only is an emergent phenomenon difficult 
to reduce directly to its component parts, but in fact the emergent phenomenon 
can display downward causation, or supervenience (influencing its own component 
parts), thus making the cause of that emergent phenomenon very difficult to define. 
In essence, the whole is a consequence of its component parts, but is irreducible to 
the actions of those components (see O’Conner 1994; Nagel 1961). 

Weak emergence, by contrast, is a means proposed by Mark Bedau to capture 
the emergent character of natural phenomena without the troublesome irreducibility 
(Bedau 1997). Bedau defines weak emergence thus: 


Macrostate P of S with microdynamic D is weakly emergent iff P can be derived from D 
and S’s external conditions but only by simulation. (Bedau 1997, p. 6) 


Thus, similar to strong emergence, the macro-level behaviour of the emergent 
system cannot be predicted merely by knowledge of its micro-components. Cru- 
cially however, those macro-level properties can be derived by allowing those 
micro-components to perform their function in simulation. Under strong emergence, 
an evolutionary simulation constructed in bottom-up ALife fashion would be unable 
to capture the complete behaviour of the resultant emergent phenomenon. Under 
weak emergence, that simulation could indeed provide a derivation of that higher- 
level behaviour. 
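
Conway's Game of Life, a cellular automaton that Bedau himself discusses, offers a compact illustration. In the sketch below the microdynamic D is the local update rule in `step`, and the macrostate P is the fact that a five-cell 'glider' pattern travels diagonally across the grid. Nothing in the rule mentions movement; in this sketch the displacement is established simply by running the rule forward. The set-of-coordinates representation and the names used are our own choices.

```python
def step(live):
    """Apply one Game of Life update on an unbounded grid.
    `live` is a set of (row, col) pairs marking the live cells."""
    neighbour_counts = {}
    for (r, c) in live:
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr != 0 or dc != 0:
                    cell = (r + dr, c + dc)
                    neighbour_counts[cell] = neighbour_counts.get(cell, 0) + 1
    # A cell is live next step if it has exactly 3 live neighbours,
    # or if it is currently live and has exactly 2.
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


# The standard glider:   . O .
#                        . . O
#                        O O O
glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}

state = glider
for _ in range(4):
    state = step(state)

# After four updates the same shape reappears one cell down and one cell right.
shifted = {(r + 1, c + 1) for (r, c) in glider}
assert state == shifted
```

The glider is of course a very tame case, since its behaviour is by now well known; the point is only that the macro-level regularity is obtained here by iterating the micro-level rule rather than by inspecting it.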

Bedau takes great pains to point out however that such weakly emergent 
behaviours are still, much like strongly emergent behaviours, essentially 
autonomous from their micro-level components. While in theory one could predict 
exactly the behaviour of a weakly emergent system with a perfectly accurate 
simulation of its micro-level components, in practice such simulations will be 
impossible to achieve. Instead, the micro-level explanation via simulation provides 
a means to observe the general properties of the macro-level weakly emergent 
result. 

In this sense there appears to be a certain circularity to weak emergence: 
simulation can provide a micro-level explanation of an empirical phenomenon, but 
in practice ‘we can formulate and investigate the basic principles of weak emergent 
phenomena only by empirically observing them at the macro-level’ (Bedau 1997, 
p. 25). Some may argue in fact that this constitutes an explanation in only a weak 
sense: one could point to a simulation of this type and note that the given micro- 
components lead to the specified macro-behaviour, but the level of insight into that 
macro-behaviour is still fundamentally limited despite intimate knowledge of the 
behaviour of its components. 

This objection becomes important once more in the context of social simulation 
and the concept of non-reductive individualism, which is explored in Chaps. 5, 6 
and 7. While Bedau’s concept of weak emergence is less metaphysically tricky 
than classical strong emergence, the difficulties that remain in explanation by 
simulation despite this new categorisation of phenomena still allow for criticism 
of the simulation approach. Such criticisms will inform our discussion of social 
simulation in the second section of this text as well as our discussion of ALife in the 
current section. 


2.6.3 Simulation as Thought Experiment 


Unsurprisingly these difficulties in using simulation for scientific explanation have 
generated much discussion within the research community. Di Paolo, Noble and 
Bullock approach this thorny issue by proposing that simulations are best viewed 
as opaque thought experiments (Di Paolo et al. 2000). This proposal draws upon 
Bedau’s earlier description of ALife models, describing them as ‘computational 
thought experiments.’ 

A traditional thought experiment in this view constitutes ‘in itself an explanation 
of its own conclusion and its implication’ (Di Paolo et al. 2000, p. 6). In 
other words a thought experiment provides a self-contained means with which to 
probe the boundaries of the theory which informs that experiment. A successful 
thought experiment can provoke a reorganisation of an existing theory as it brings 
previously-known elements of that theory into a novel focus. 

Simulation experiments, it is argued, can fulfill a similar purpose. However, 
simulations suffer from an inherent opacity: as noted in our discussion of emergence, 
the modeler’s knowledge of the workings of the simulation does not imply 
an understanding of the simulation’s results. Unlike in a conventional thought 
experiment, the modeler must spend time unraveling the result of his simulation, 
probing the consequences to determine the implications for theory. 

As a result of this view, the authors propose a methodology for simulation 
research that departs from the conventional one. Firstly, they contend that the successful 
replication of a result given some mechanism described in the simulation does not 
constitute an explanation (a misconception common to simulation work, and clearly 
debunked by the characteristics of emergence mentioned earlier). In consequence 
the explanation which may be drawn from simulation work is likely to incorporate 
an ‘explanatory organisation,’ in which some elements of the problem are explained 
through micro-level behaviour, others may be explained at the macro-level, and still 
others in relations between the two. 

In essence, they advocate an additional step in the modelling process in which 
the modeler performs experiments on the simulation, as one might do in a laboratory 
environment. The systematic exploration of the model itself is intended to provide 
a greater understanding of its inner workings, and in turn this theory of the model’s 
behaviour must then be related to theories about the natural world which provide 
the inspiration for the model. So the ALife researcher can accept the view that 
emergent behaviours are difficult to explain via simulation, while recognising that 
forming theories about the simulation that relate to theories about those behaviours 
may produce a new insight into pre-existing theories, as with a successful thought 
experiment. 


2.6.4 Explanation Compared: Simulations vs Mathematical 
Models 


Taking the thought-experiment perspective further, Bryden and Noble 
(2006) contrast the explanatory capacity of simulation models with that of mathematical 
models. They seek to explore what is required of an explanation derived 
from simulation, criticising the unfortunately common view that a 
simple qualitative similarity between the simulation result and the behaviour of the 
real system is sufficient to provide some sort of explanation. 

Bryden and Noble note another element of the inherent opacity of simulation 
research: the analytical incompleteness of such models. Mathematical treatments, 
when flawed, are easily revealed as such. Simulations in contrast can be run many 
times, with different parameter values, and flaws in the coding of the simulation 
may not be immediately apparent. Similarly, those simulation runs represent only 
isolated data points in the entire possible space of runs allowable in that model; since 
no researcher can spare the time to browse the entire space of parameter values for 
a simulation, the results we see are only a fraction of what is possible. 
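
The point about isolated data points can be made concrete with a small sketch. The code below is an illustration only, not Bryden and Noble's method: it assumes the logistic map as a stand-in for an arbitrary simulation, and the names attractor_size and sweep are hypothetical. Each sampled parameter value yields one observation of the model's long-run behaviour; whatever lies between the sampled values goes unobserved.

```python
def attractor_size(growth, x0=0.2, burn_in=1000, sample=64):
    """Count the distinct long-run states of the logistic map for one parameter value.

    The logistic map is an assumed stand-in for 'a simulation'; it is not
    taken from the text.
    """
    x = x0
    for _ in range(burn_in):            # discard the initial transient
        x = growth * x * (1.0 - x)
    seen = set()
    for _ in range(sample):             # record where the system settles
        x = growth * x * (1.0 - x)
        seen.add(round(x, 4))           # coarse rounding groups near-identical states
    return len(seen)


def sweep(values):
    """Run the model once per sampled parameter value."""
    return {growth: attractor_size(growth) for growth in values}


# A coarse sweep samples only a handful of points in the parameter space.
coarse = sweep([2.5, 3.5])
# A finer sweep reveals behaviour the coarse one never visits.
finer = sweep([2.5, 3.2, 3.5])
```

In this toy case the coarse sweep sees a single steady state at 2.5 and a four-state cycle at 3.5, and misses the two-state cycle at 3.2 entirely. A real simulation with many interacting parameters is far harder to sample adequately.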

The authors go on to advocate a means for decomposing a simulation system 
into component mechanistic subsystems which are more amenable to mathematical 
explanation. A working model which is decomposed in this way may still not 
provide the complete analytical package of an exclusively mathematical treatment, 
but it is argued that this brings the researcher closer to a full analytical solution of 
the target system. Thus the computational model is seen as a means to generate the 
tools necessary to reach a cohesive mathematical explanation of the phenomenon 
under study. 

In a broader context this approach is quite close to that proposed by Di Paolo, 
Noble and Bullock (Di Paolo et al. 2000). In both cases the modeler spends time 
exploring the confines of the model in question, probing its inner workings to define 
the parameters in which that model operates. Having done this, the modeler may 
begin to relate those theories about the model to theories about the world; in Bryden 
and Noble’s view, for maximum explanatory power those relations should take the 
form of mathematical treatments. From both, however, we draw the conclusion that 
a model which is able to replicate an emergent behaviour through the simulation 
of a system’s micro-level component interactions is still very far from providing 
an explanation of that system. The simulation itself must be further deconstructed, 
its parameters understood, and its workings probed in order to relate that model 
successfully to the real system it represents. Thus simulation becomes a 
valuable tool in the scientist’s repertoire, but one that must be supplemented by 
more traditional means of enquiry as well. 


2.7 Summary and Conclusions 


The problems facing the simulation methodology are clearly far from philosoph- 
ically transparent. From mathematical models to evolutionary simulations, these 
tools display potential explanatory power and are quick to develop in comparison 
to traditional empirical studies. However, with that relative ease of use comes a 
correspondingly high difficulty of analysis. 

The type of simulation discussed here, particularly in the context of ALife, 
focuses on the investigation of emergent phenomena in the natural world. These 
phenomena by their very nature are difficult to explain; simulation provides a means 
to view the origin of these phenomena from the behaviour of low-level component 
parts, but remains limited in its ability to explain the overall behaviour of that higher-level 
emergent order. 

A number of researchers and philosophers have attempted to justify the use of 
simulation as a means for scientific explanation in a variety of ways; our synthesis up 
to this point indicates that simulation is certainly a useful element in the explanatory 
toolbox. Simulation however cannot stand alone: a simulation which displays an 
emergent behaviour still requires a theoretical framework to describe that behaviour. 

Within ALife, different streams of thought have developed in response to these 
difficulties; questions surrounding the appropriate use of simulation in the field have 
led to extensive debate on the validity of simulation as a means to generate data. A 
further investigation into the theoretical underpinnings of ALife in the following 
chapter will provide insight into the fundamental aspects of this debate, and lead 
us further into this analysis of simulation methodology. This investigation will also 
provide important theoretical background for future discussion of Alife models and 
their relation to more general methodological concerns in modelling amongst the 
broader biology community. 


Chapter 3 
Making the Artificial Real 


3.1 Overview 


Having established in Chaps. 1 and 2 a working understanding of the philosophical 
underpinnings of computer simulation across disciplines, we turn now to a relatively 
new field which has created a stir throughout the computer science community to 
investigate questions of artificiality and its effect upon this type of inquiry. Can a 
simulation create novel datasets which allow us to discover new things about natural 
systems, or are simulations based on the natural world destined to be mere facsimiles 
of the systems that inspire them? 

For a possible answer we turn to Artificial Life, a field which jumped to 
prominence in the late 1980s and early 1990s following the first international 
conferences on the topic. Proponents of ‘Strong’ Alife argue that this new science 
provides a means for studying ‘life-as-it-could-be’ through creating digital systems 
that are nevertheless alive (Langton et al. 1989; Ray 1994). 

Of course, such a bold claim invites healthy skepticism, so in this chapter we 
will investigate the difficulties with strong Alife. By discussing first the nature 
of artificiality in science and simulation, then investigating related theoretical and 
methodological frameworks in Artificial Intelligence and more traditional sciences, 
we attempt to uncover a way in which the strong Alife community might justify 
their field as a bona fide method for producing novel living systems. 

This discussion is of significant importance to the chapters that follow. 
Providing a theoretical background for any given simulation can greatly impact 
the modeller’s ability to link a simulation to conventional empirical science, and 
also serves to illuminate the assumptions inherent in the model and their possible 
impacts. These issues will be discussed further in the following chapter, in which 
this first section of the text incorporates broader concerns regarding modelling from 
the biological community in preparation for creating a framework that incorporates 
social science in the second section. 


3.2 Strong vs. Weak Alife and AI 


3.2.1 Strong vs. Weak AI: Creating Intelligence 


The AI community has often been characterised as encompassing two separate 
strands of research: Strong AI, and Weak AI. Strong AI aims to develop computer 
programmes or devices which exhibit true intelligence; these machines would be 
aware and sentient in the same way as a human being. Weak AI, in contrast, aims to 
develop systems which display a facsimile of intelligence; these researchers do not 
attempt to create an intelligent being electronically, but instead a digital presence 
which displays the abilities and advantages of an intelligent being, such as natural 
language comprehension and flexible problem-solving skills. 


3.2.2 Strong vs. Weak Alife: Creating Life? 


The title of the first international artificial life conference immediately identified 
two streams of Alife research: the synthesis of living systems as opposed to their 
simulation. Perhaps designed to echo the distinction made between strong and weak 
artificial intelligence, this division has been readily adopted by the Alife community 
in the following years. While strong Alife attempts to create real organisms in a 
digital substrate, the weak strand of Alife instead attempts to emulate certain aspects 
of life (such as adaptability, evolution, and group behaviours) in order to improve 
our understanding of natural living systems. 


3.2.3 Defining Life and Mind 


Both intelligence and life suffer from inherent difficulties in formalisation; debate 
has raged in psychology and biology about what factors might constitute intelligent 
or living beings. AI has attempted to define intelligence in innumerable ways since 
its inception, resulting in such notable concepts as the Turing Test, in which a 
computer programme or device which can fool a blind observer into believing that 
it is human is judged to be genuinely intelligent (Turing 1950). 

Meanwhile, prominent philosophers of mind have pointed out the flaws in such 
arguments, as exemplified in the oft-cited Chinese Room dilemma posed by Searle 
(1980). In this thought experiment, an individual is locked inside a room with 
an encyclopaedic volume delineating a set of rules allowing for perfect discourse 
in Chinese. This individual engages in a dialogue with a native Chinese speaker 
outside the room merely by following his rulebook. Searle contends that since 
the speaker himself does not have any innate knowledge of Chinese, his ability to 
speak Chinese fluently and convincingly through his rulebook does not prove his 
intelligence. Counter-arguments to Searle’s Chinese Room have attempted to evade 
this apparent conundrum by claiming that, while the individual inside the room has 
no comprehension of Chinese, the entire system (encompassing the room, rulebook, 
and individual) does have an intelligent comprehension of Chinese, thus making 
the system an intelligent whole; Searle of course offered his own rejoinder to this 
idea (Searle 1980, 1982), though the controversy continues with connectionists and 
other philosophers continuing to weigh in with their own analyses (Churchland and 
Churchland 1990; Harnad 2005). 

Similarly life as a phenomenon is perhaps equally difficult to define. While most 
researchers in relevant fields agree that living systems must be able to reproduce 
independently and display self-directed behaviour, this definition can fall apart 
when one is presented with exceptional organisms such as viruses, which display 
that reproductive component but consist of a bare minimum of materials necessary 
for that action. Are such organisms still alive, or are they no more than self-reproducing 
protein strands? Alternatively, are even those protein strands in some 
way ‘alive’? Researchers and philosophers alike continue to dispute the particulars. 
Alife researchers’ claims that they can produce ‘real’ digital life become more 
problematic in this regard, as under such a nebulous framework for what constitutes 
life, those researchers have quite a lot of latitude under which to make such 
statements. 


3.3 Levels of Artificiality 


3.3.1 The Need for Definitions of Artificiality 


The root of the problem of strong Alife is hinted at by the suggestion of unreality 
or falsity that can be connoted by the terms thus far used to characterise Alife: 
synthetic or simulated. A clarification of the type of artificiality under consideration 
could provide a more coherent picture of the type of systems under examination 
in this type of inquiry, rather than leaving Alife researchers mired in a morass of 
ill-defined terminology. 

Silverman and Bullock (2004) outline a simple two-part definition of the term 
‘artificial,’ each intended to illuminate the disparity between the natural system and 
the artificial system under consideration. First, the word artificial can be used to 
denote a man-made example of something natural (hereafter denoted Artificial₁). 
Second, the word can be used to describe something that has been designed to 
closely resemble something else (hereafter denoted Artificial₂). 


3.3.2 Artificial₁: Examples and Analysis 


Artificial₁ systems are frequently apparent in everyday reality. For example, artificial 
light sources produce real light which consists of photons in exactly the same 
manner as natural light sources, but that light is indeed manufactured rather than 
being produced by the sun or bioluminescence. A major advantage for the ‘strong 
artificial light’ researcher is that our current scientific understanding provides that 
researcher with a physical theory which allows us to combine phenomena such as 
sunlight, firelight, light-bulb light, and other forms of light into the singular category 
of real light. 

Brian Keeley’s example of artificial flavourings (Keeley 1997) shows the lim- 
itations of this definition. While an artificial strawberry flavouring might produce 
sensations in human tastebuds which are indistinguishable from real strawberry 
flavouring, this artificially-produced compound (which we shall assume has a 
different molecular structure from the natural compound) not only originates from 
a different source than the natural compound, but is also a different compound 
altogether. In this case, while initially appearing to be a real instance of strawberry 
flavouring, one can make a convincing argument for the inherent artificiality of the 
manufactured strawberry flavour. 


3.3.3 Artificial₂: Examples and Analysis 


Artificial₂ systems, those designed to closely resemble something else, are similarly 
plentiful, but soon show the inherent difficulties of relating natural and artificial 
systems in this context. Returning to the artificial light example, we could imagine 
an Artificial₂ system which attempts to investigate the properties of light without 
producing light itself; perhaps by constructing a computational model of optical 
apparatus, for example, or developing means for replicating the effects of light 
upon a room using certain architectural and design mechanisms. In this case, our 
Artificial₂ system would allow us to learn about how light works and why it appears 
to our senses in the ways that it does, but it would not produce real light as in an 
Artificial₁ system. 

Returning to the case of Keeley’s strawberry flavouring, we can place the manufactured 
strawberry flavouring more comfortably into the category of Artificial₂. 
While the compound is inherently useful in that it may provide a great deal of insight 
into the chemical and biological factors that produce a sensation of strawberry 
flavour, the compound itself is demonstrably different from the natural flavouring, 
and therefore cannot be used as a replacement for studying the natural flavouring. 


3.3.4 Keeley’s Relationships Between Entities 


In an effort to clarify these complex relationships between natural and artificial 
systems, Brian Keeley (1997) describes three fundamental ways in which natural 
and artificial entities can be related: 


... (1) entities can be genetically related, that is, they can share a common origin, (2) entities 
can be functionally related in that they share properties when described at some level of 
abstraction, and (3) entities can be compositionally related; that is, they can be made of 
similar parts constructed in similar ways. (Keeley 1997, p. 3, original emphasis) 


This description seems to make the Alife researcher’s position even more 
intractable. The first category seems highly improbable as a potential relationship 
between natural systems and Alife, given that natural life and digital life cannot 
share genetic origins. The third category is more useful in the field of robotics per- 
haps, in which entities could conceivably be constructed which are compositionally 
similar, or perhaps even identical, to biological systems. The second category, as 
Keeley notes, seems most important to Alife simulation; establishing a functional 
relationship between natural life and Alife seems crucial to the acceptance of Alife 
as empirical enquiry. 


3.4 ‘Real’ AI: Embodiment and Real-World Functionality 


3.4.1 Rodney Brooks and ‘Intelligence Without Reason’ 


Rodney Brooks began a movement in robotics research toward a new methodology 
for robot construction with his landmark paper ‘Intelligence Without Reason’ 
(Brooks 1991). He advocated a shift towards a focus on embodied systems, or 
systems that function directly in a complex environment, as opposed to systems 
designed to emulate intelligent behaviours on a ‘higher’ level. Further, he posited 
that such embodiment could produce seemingly intelligent behaviour without high- 
level control structures at all; mobile robots, for example, could use simple rules to 
walk which when combined with the complexities of real-world environments may 
produce remarkably adaptable walking behaviours. 

For Brooks and his contemporaries, the environment is not something to be 
abstracted away in an AI task, but something which must be dealt with directly 
and efficiently. In a manner somewhat analogous to the Turing Test, AI systems 
must in a sense ‘prove’ their intelligence not through success in the digital realm 
but through accomplishments rooted in real-world endeavour. 


3.4.2 Real-World Functionality in Vision and Cognitive 
Research 


While embodiment may appear to be less of a concern in AI research related to 
vision and cognition, as these behaviours can be separated from the embodied 
organism more readily, such research is often still rooted in real-world environ- 
ments. Winograd’s well-known SHRDLU program (Winograd 1972) could answer 
questions addressed in natural language that related to a ‘block world’ that it was 
able to manipulate; the program could demonstrate knowledge of the properties of 
this world and the relationships of the various blocks to one another. While the 
‘block world’ itself was an artificial construct, the success of SHRDLU was based 
upon its ability to interact with the human experimenter about its knowledge of that 
virtual world, rather than just its ability to manipulate the virtual blocks and function 
within that world. 

Computer vision researchers follow a similar pattern to the natural-language 
community, focusing on systems which can display a marked degree of competence 
while encountering realistic visual stimuli. Object recognition is a frequent 
example of such a problem, testing a system’s ability to perceive and recognize 
images culled from real-world stimuli, such as recognising moving objects in team- 
sports footage (Bennett et al. 2004). Similarly, the current popularity of CCTV 
systems has led to great interest in cognitive systems which can analyse the wealth 
of footage provided from many cameras simultaneously (Dee and Hogg 2006). 
In the extreme, modern humanoid roboticists must integrate ideas from multiple 
disciplines related to human behaviour and physiology as well as AI in order to 
make these constructions viable in a real-world environment (Brooks et al. 1999). 

Within the AI research community, the enduring legacy of the Turing Test 
has created a research environment in which real-world performance must be the 
end goal; merely understanding basic problems or dealing exclusively in idealised 
versions of real-world situations is not sufficient to prove a system’s capacity 
for intelligent behaviour. The question of artificiality is less important than that of 
practicality; as in engineering disciplines, a functional system is the end goal. 


3.4.3 The Differing Goals of AI and Alife: Real-World 
Constraints 


Clearly, real-world constraints are vital to the success of most research endeavours 
in AI. Intelligent systems, Strong or Weak, must be able to distinguish and respond 
to physical, visual, or linguistic stimuli, among others, in a manner sufficient to 
provide a useful real-world response. Without this, such systems would be markedly 
inferior next to the notable processing abilities of even the most rudimentary 
mammalian brain. 

For Alife, however, the landscape is far more muddled. Most simulations take 
place in an idealised virtual context, with a minimum of complex and interacting 
factors, in the hopes of isolating or displaying certain key properties in a clear 
fashion. Given the disparity between the biological and digital substrates, and 
the difficulties in defining life itself, the gap between the idealised virtual 
context of the simulated organisms and any comparable biological organism seems 
quite wide. 

In this sense, artificial intelligence has a great advantage, as ‘human’ intelligence 
and reasoning is a property of a certain subset of living beings, but can be viewed 
as a property apart from the biological nature of those living beings to a degree. 
Artificial life, by its very nature, attempts to investigate properties which depend 
upon that living substrate in a much more direct fashion. 


3.5 ‘Real’ Alife: Langton and the Information Ecology 


3.5.1 Early Alife Work and Justifications for Research 


The beginnings of Alife research stemmed from a number of examinations into 
the properties of life using a series of relatively recent computational methods. 
Genetic algorithms, which allow programs to follow a process of evolution in 
line with a defined ‘fitness function’ to produce more capable programs, were 
applied to digital organisms which attempted to compete and reproduce rather than 
simply solve a task (Ray 1994). Similarly, cellular automata displayed remarkable 
complexity despite being derived from very simple rules; such a concept of 
‘emergent behaviour’ came to underwrite much of Alife in the years to come. 
Creating simulations which display life-like behaviours using only simple rule-sets 
seemed a powerful metaphor for the complexity of life deriving from the interactions 
of genes and proteins. 
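
To illustrate how little is needed to produce such behaviour, the following sketch implements a one-dimensional, two-state cellular automaton of the kind often used to demonstrate emergent complexity. It is an illustrative example rather than a reconstruction of any model cited here; the choice of rule 110, the ring-shaped lattice, and the function names step and run are assumptions made for the example.

```python
def step(cells, rule=110):
    """Apply one update of an elementary cellular automaton on a ring of cells."""
    n = len(cells)
    updated = []
    for i in range(n):
        left, centre, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        # Read the three-cell neighbourhood as a number between 0 and 7.
        index = (left << 2) | (centre << 1) | right
        # The rule number's binary digits give the new state for each neighbourhood.
        updated.append((rule >> index) & 1)
    return updated


def run(width=31, generations=15, rule=110):
    """Start from a single live cell and record each generation."""
    cells = [0] * width
    cells[width // 2] = 1
    history = [cells]
    for _ in range(generations):
        cells = step(cells, rule)
        history.append(cells)
    return history


if __name__ == "__main__":
    for row in run():
        print("".join("#" if cell else "." for cell in row))
```

The entire 'physics' of this world is an eight-entry lookup table, yet the printed pattern is difficult to predict from the table alone, which is the sense in which the behaviour is described as emergent.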


3.5.2 Ray and Langton: Creating Digital Life? 


Ray (1994) and Langton (1992) were early proponents of the Strong Alife view. 
Ray contended that his Tierra simulation, in which small programs competed for 
memory space in a virtual CPU, displayed an incredible array of varied ‘species’ of 
digital organisms. He further posited that such simulations might hail the beginnings 
of a new field of digital biology, in which the study of digital organisms may teach 
researchers about new properties of life that may be difficult to study in a natural 
context; he argued that his digital biosphere was performing fundamentally the same 
functions as the natural biosphere: 


Organic life is viewed as utilising energy, mostly derived from the Sun, to organize 
matter. By analogy, digital life can be viewed as using CPU (central processing unit) 
time, to organize memory. Organic life evolves through natural selection as individuals 
compete for resources (light, food, space, etc.) such that genotypes which leave the 
most descendants increase in frequency. Digital life evolves through the same process, as 
replicating algorithms compete for CPU time and memory space, and organisms evolve 
strategies to exploit one another. (Ray 1996, p. 373-4) 


For Ray, then, an environment providing limited resources as a mechanism 
for driving natural selection and an open-ended evolutionary process is sufficient 
to produce ‘increasing diversity and complexity in a parallel to the Cambrian 
explosion.’ He goes on to describe the potential utility of such artificial worlds 
for a new variety of synthetic biology, comparing these new digital forms of ‘real’ 
artificial life to established biological life. 
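
The resource competition that Ray describes can be caricatured in a few lines. The sketch below is emphatically not Tierra: it assumes that each 'organism' is nothing more than a genome length, that every organism receives one unit of CPU time per step, that copying takes as many steps as the genome is long, and that a 'reaper' removes the oldest organisms when memory is full. All names and parameter values (run_soup, capacity, the minimum length of 5) are hypothetical.

```python
import random


def run_soup(capacity=200, steps=2000, seed=0):
    """Toy replicators competing for CPU time and a fixed amount of memory."""
    rng = random.Random(seed)
    # Each organism is a genome length plus its progress toward completing a copy.
    soup = [{"length": 20, "progress": 0}]
    for _ in range(steps):
        newborns = []
        for organism in soup:
            organism["progress"] += 1              # one unit of CPU time each
            if organism["progress"] >= organism["length"]:
                organism["progress"] = 0
                # Copying is imperfect: the child may be slightly shorter or longer.
                mutated = organism["length"] + rng.choice([-1, 0, 0, 1])
                newborns.append({"length": max(5, mutated), "progress": 0})
        soup.extend(newborns)
        # The reaper frees memory by removing the oldest organisms first.
        while len(soup) > 1 and sum(o["length"] for o in soup) > capacity:
            soup.pop(0)
    return soup


if __name__ == "__main__":
    final = run_soup()
    print(len(final), sum(o["length"] for o in final) / len(final))
```

Because shorter genomes finish copying sooner, one might expect them to spread through the soup, but no claim is made that this toy reproduces the ecological dynamics Ray reports. It shows only how limited CPU time and memory can act as selective pressures.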

Langton (1992), in his investigation of cellular automata, takes this idea one 
step further, describing how ‘hardware’ computer systems may be designed to 
achieve the same dynamical behaviours as biological ‘wetware.’ He posits that 
properly organised synthetic systems can provide these same seemingly unattainable 
properties (such as life and intelligence), given that each of these types of systems 
can exhibit similar dynamics: 


...if it is properly understood that hardness, wetness, or gaseousness are properties of the 
organization of matter, rather than properties of the matter itself, then it is only a matter of 
organization to turn ‘hardware’ into ‘wetware’ and, ultimately, for ‘hardware’ to achieve 
everything that has been achieved by wetware, and more. (Langton 1992, p. 84) 


For Langton, life is a dynamical system which strives to avoid stagnation, 
proceeding in a series of phase transitions from one higher-level evolved state to the 
next. This property can be investigated and replicated in computational form, which 
in his view seems to provide sufficient potential for hardware to develop identical 
properties to wetware in the appropriate conditions (such as an appropriately- 
designed cellular automata space). 


3.5.3 Langton’s Information Ecology 


Langton (1992; Langton et al. 1989) attempted to justify his views regarding 
artificial life by proposing a new definition for biological life. Given that natural life 
depends upon the exchange and modification of genetic information through natural 
selection, Langton suggests that these dynamics of information exchange are in fact 
the essential components of life: 


...in living systems, a dynamics of information has gained control over the dynamics of 
energy, which determines the behavior of most non-living systems. (Langton 1992, p. 41) 


Thus, the comparatively simple thermodynamically regulated behaviours of non- 
living systems give way to living systems regulated by the dynamics of gene 
exchange. From this premise, Langton proposes that if this ‘information ecology’ 
were accepted as the defining conditions for life, then a computer simulation could 
conceivably create an information ecology displaying the same properties. 


3.6 Toward a Framework for Empirical Alife 


3.6.1 A Framework for Empirical Science in AI 


If we seek the construction of a theoretical framework to underwrite empirical 
exploration in Alife, we can gather inspiration from Newell and Simon’s seminal 
lecture (Newell and Simon 1976). The authors sought to establish AI as a potential 
means for the empirical examination of intelligence and its origins. They argue that 
computer science is fundamentally an empirical pursuit: 


Computer science is an empirical discipline.... Each new machine that is built is an 
experiment. Actually constructing the machine poses a question to nature; and we listen 
for the answer by observing the machine in operation and analyzing it by all analytical and 
measurement means available. (Newell and Simon 1976, p. 114) 


However, the argument that computer science is fundamentally empirical due to 
its experimental interactions with replicable, physical systems is not sufficient to 
claim that an AI can be used as a means to study intelligence empirically. Newell 
and Simon address this by proposing a definition of a physical symbol system, or 
‘a machine that produces through time an evolving collection of symbol structures’ 
(Newell and Simon 1976, p. 116). The details of the definition and its full import are 
beyond the scope of this text, but in essence, the authors argue that physical symbol 
systems are capable of exhibiting ‘general intelligent action’ (Newell and Simon 
1976, p. 116), and further, that studying any generally intelligent system will prove 
that it is, in fact, a physical symbol system. 

Newell and Simon then suggest that physical symbol systems can, by definition, 
be replicated by a universal computer. This leads us to their famous Physical Symbol 
System Hypothesis — or PSS Hypothesis — which we can summarise as follows: 


1. ‘A Physical Symbol System has the necessary and sufficient means for general 
intelligent action.’ (Newell and Simon 1976, p. 116) 
2. A computer is capable of replicating a Physical Symbol System. 


Thus, by establishing general intelligence as a process of manipulating symbols 
and symbol expressions, and by noting that computers are capable of replicating and 
performing identical functions (and indeed are quite good at doing so), Newell and Simon 
present AI as an empirical study of real, physical systems capable of intelligence. 
AI researchers are not merely manipulating software for the sake of curiosity, but 
are developing real examples of intelligent systems following the same principles 
as biological intelligence. 


3.6.2 Newell and Simon Lead the Way 


Newell and Simon’s PSS Hypothesis (Newell and Simon 1976) largely succeeded 
in providing a framework for AI researchers at the time, and one that continues to be 
referenced and used today. Such a framework is notably absent in Alife, however, 
given the difficulties in both defining Alife and in specifying a unified field of Alife, 
which necessarily spans quite a few methodologies and theoretical backgrounds. 
With Langton’s extension of the fledgling field of Alife into the study of ‘life-as- 
it-could-be,’ a more unified theoretical approach seems vital to understanding the 
relationship between natural and artificial life: 


Artificial life is the study of artificial systems that exhibit behavior characteristic of natural 
living systems. It is the quest to explain life in any of its possible manifestations, without 
restriction to the particular examples that have evolved on Earth. This includes biological 
and chemical experiments, computer simulations, and purely theoretical endeavors. Pro- 
cesses occurring on molecular, social and evolutionary scales are subject to investigation. 
The ultimate goal is to extract the logical form of living systems. (Langton, announcement 
of Artificial Life: First International Conference on the Simulation and Synthesis of Living 
Systems) 


By likening Alife to the study of the very nature of living systems, Langton 
appeals to the apparent flexibility and power of computer simulations. The simulation
designer has the opportunity to create artificial worlds that run orders of
magnitude faster than our own and to watch thousands of generations of evolution pass
in a short space of time; simulation thus seems to provide an unprecedented opportunity
to observe the grand machinery of life in a manner that is impossible in traditional
biology.

This, in turn, leads to an enticing prospect: can such broad-stroke simulations be 
used to answer pressing empirical questions about natural living systems? Bedau 
(1998), for example, sees a role for Alife in a thought experiment originally 
proposed by Gould (1989). Gould asks what might happen if we were able to 
rewind evolutionary history to a point preceding the first developments of terrestrial 
life. If we changed some of those initial conditions, perhaps merely by interfering 
slightly with the ‘primordial soup’ of self-replicating molecules, what would we see 
upon returning to our own time? Gould suggests that while we might very well see 
organisms much the same as ourselves, there is no reason to assume that this would 
be the case; we may resume life in our usual time frame to discover that evolutionary 
history has completely rearranged itself as a result of these manipulations. 

For Bedau, this thought experiment presents an opening for Alife to settle a 
question fundamentally closed to traditional biology. By constructing a suitable 
simulation which replicates the most important elements of biological life and 
the evolutionary process, and running through numerous simulations based on 
variously-perturbed primordial soups, we could observe the resultant artificial 
organisms and see for ourselves the level of diversity which results. 
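
As a rough illustration of the procedure just described, the following minimal sketch replays a toy 'tape of life' from several slightly perturbed starting points and then measures how different the resulting organisms are. Everything in it is an assumption made for illustration only: the bit-string genomes, the fitness rule (which rewards any genome with an equal number of ones and zeros, so that many outcomes are equally fit), and the names GENOME_LEN, fitness, replay and diversity are hypothetical and are not drawn from Bedau or Gould.

```python
import random
from itertools import combinations

GENOME_LEN = 16  # hypothetical genome size, chosen only for illustration


def fitness(genome):
    """Toy fitness: highest when exactly half the bits are set (many equal optima)."""
    half = GENOME_LEN // 2
    return half - abs(sum(genome) - half)


def replay(seed, flips=2, pop_size=30, generations=200, mutation_rate=0.02):
    """Run one replay of the tape from a slightly perturbed initial 'soup'."""
    rng = random.Random(seed)
    ancestor = [0] * GENOME_LEN
    # Perturb the initial conditions: flip a few randomly chosen bits.
    for i in rng.sample(range(GENOME_LEN), flips):
        ancestor[i] ^= 1
    population = [ancestor[:] for _ in range(pop_size)]
    for _ in range(generations):
        next_gen = []
        for _ in range(pop_size):
            # Tournament selection of size two, then per-bit mutation.
            a, b = rng.choice(population), rng.choice(population)
            parent = a if fitness(a) >= fitness(b) else b
            child = [bit ^ (rng.random() < mutation_rate) for bit in parent]
            next_gen.append(child)
        population = next_gen
    # Report the fittest genome present at the end of this replay.
    return max(population, key=fitness)


def diversity(genomes):
    """Mean pairwise Hamming distance between the outcomes of the replays."""
    pairs = list(combinations(genomes, 2))
    return sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)


winners = [replay(seed) for seed in range(10)]
print("mean pairwise Hamming distance:", diversity(winners))
```

A distance near zero would suggest that the replays converge on much the same organism, while a large distance would suggest that history in this toy world is contingent on its starting conditions. Whether any such result bears on natural evolution is, of course, precisely the question at issue.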

Other authors have proposed Alife simulations that fall along similar lines 
(Bonabeau and Theraulaz 1994; Ray 1994; Miller 1995), noting that evolutionary 
biologists are burdened with a paucity of evidence with which to reconstruct the 
evolutionary course of life on Earth. The fossil record is notoriously incomplete, 
and our vanishingly small time on Earth has allowed precious few opportunities 
to observe even the species which exist around us today. In addition, as Gould’s 
thought experiment highlights, we only have the opportunity to observe those 
organisms which have evolved on Earth, leaving us entirely uncertain of which
properties we observe in that life are particular to life on Earth, and which are
common to life in any form.

Quite obviously such an experiment could be potentially revolutionary, and yet 
the methodological problems are vast despite the inherent flexibility of computer 
simulation. How could we possibly confirm that these artificial organisms are 
Artificial₁ rather than Artificial₂? Given our lack of a definition for biological life,
determining whether a digital organism is alive seems a matter of guesswork at best.
Further, how detailed must the simulation be to be considered a reasonable facsimile
of real-world evolutionary dynamics? Should it select on the level of individuals,
or genes, or perhaps even artificial molecules of some sort? If by some measure we
determine the simulation to be Artificial₂ rather than Artificial₁, can we truly claim
that this simulation in any way represents the development of natural living systems, 
or is it merely a fanciful exploration of poorly-understood evolutionary dynamics? 


3.6.3 Theory-Dependence in Empirical Science 


Beyond these methodological considerations, some troubling philosophical issues 
appear when considering such large-scale applications of the Alife approach. Some 
have argued that Alife simulations should not, and in fact cannot, be considered 
useful sources of empirical data given that they are loaded with inherent biases from 
their creators (Di Paolo et al. 2000). Simulations must necessarily adopt varying 
levels of abstraction in order to produce simulations which are both computable and 
analysable; after all, simulations which replicate the complexities of real biology in 
exacting detail would gain the experimenter very little time-saving and eliminate 
one of the major benefits of computational modelling. These abstractions are 
tacitly influenced by the experimenter’s own biases, relying upon the simulation 
creator’s perception of which aspects of the biological system can be simplified or 
even removed from the simulation entirely. Similarly, the parameter settings for 
that simulation will come with their own biases, resulting in simulations which 
could produce vastly different results depending on the theoretical leanings of the 
programmer. 

While one might claim that conventional empirical science suffers from very 
similar shortfalls, Chalmers points out that biases within such fields must necessarily 
end when the results of an experiment contradict those biases: 

... however informed by theory an experiment is, there is a strong sense in which the results
of an experiment are determined by the world and not by theories... we cannot make [the]
outcomes conform to our theories. (Chalmers 1999, pp. 39-40)


In the case of computer simulations this conclusion seems more difficult to draw. After
all, we are in essence adding a further layer: in addition to the two layers of
the physical world and the theory that describes it, we now have a model layer as 
well, which approximates the world as described by our theory. Clearly this adds 
additional complications: when not only the experiment but the very world in which 
that experiment takes place are designed by that biased experimenter, how can one 
be sure that pre-existing theoretical biases have not completely contaminated the 
simulation in question? 

To return to our central bird migration example, one can imagine various ways in 
which our migration researcher could introduce theoretical bias into the simulation. 
That researcher’s ideas regarding the progression of bird migration, the importance 
of individual behaviours or environmental factors in migration, and other pre- 
existing theoretical frameworks in use by the researcher will inform the assumptions 
made during construction of the model. In such a scenario, removing such biases is 
difficult for the modeller, as on some level the model requires certain assumptions 
to function; the researcher needs to develop a theoretical framework in which this 
artificial data remains usable despite these issues. 


3.6.4 Artificial Data in Empirical Science 


An examination of more orthodox methods in empirical science may shed some 
light on how some methodologies use artificially-generated data to answer empirical 
questions. While these methods are more firmly grounded in the natural world than an Alife
simulation, they still rely upon creating a sort of artificial world in which to examine 
specific elements of a process or phenomenon. 

Despite the artificiality of this generated data, such disciplines have achieved a 
high degree of acceptance amongst the research community. A brief look at some 
examples of the use of such data in empirical research will provide some insight 
into possible means for using artificial data derived from simulation. 


3.6.4.1 Trans-Cranial Magnetic Stimulation 


Research into brain function often employs patients who have suffered brain damage 
through strokes or head injuries. Brains are examined to determine which areas are 
damaged, and associations between these damaged areas and the functional deficits 
exhibited by the patients are postulated. The technique of trans-cranial magnetic 
stimulation, known as TCMS or TMS, has allowed psychology researchers to extend 
the scope of this approach by generating temporary artificial strokes in otherwise 
healthy individuals. 

TMS machinery produces this effect through an electromagnetic coil which is
held against the subject’s scalp. The researcher first maps the subject’s
brain using an MRI scan, and then begins using the TMS apparatus to fire brief 
magnetic pulses at the surface of the brain. These pulses in effect overstimulate the
affected neurons, causing small areas of the brain to ‘short-circuit,’ which replicates
the effects of certain types of brain injury. TMS researchers have managed to 
replicate the effects of certain types of seizure (Fujiki and Steward 1997), and have 
examined the effects of stimulation of the occipital cortex in patients with early- 
onset blindness (Kujala et al. 2000). TMS has even produced anomalous emotional 
responses; after inhibition of the prefrontal cortex via TMS, visual stimuli that might 
normally trigger a negative response were much more likely to cause a positive 
response (Padberg 2001). 

Such methods provide a way for neuroscientists and psychologists to circumvent 
the lack of sufficient data from lesion studies. Much of cognitive neuropsychology 
suffers from this paucity of raw data, forcing researchers to spend inordinate 
amounts of time searching through lengthy hospital records and medical journals 
for patients suffering from potentially appropriate brain injuries. Even when such 
subjects are found, normal hospital testing may not have revealed the true nature of 
the subject’s injury, meaning that potentially valuable subjects go unnoticed as some 
finely differentiated neurological deficits are unlikely to be discovered by hospital 
staff. Assuming that all of these obstacles are overcome, the researcher is
still faced with a vanishingly small subject pool which may adversely affect the 
generalisability of their conclusions. 

TMS thus seems a remarkable innovation, one which suddenly widens the 
potential pool of subjects for cognitive neuropsychology to such a degree that any 
human being may become a viable subject regardless of the presence or lack of 
brain injury. However, there are a number of shortcomings to TMS which could 
cause one to question the validity of associated results. TMS inhibits normal neural
activity by driving neurons into such intense activity that normal
firing patterns become impossible; while this certainly disrupts activity
in the brain area underneath the pulse, TMS is not necessarily capable of replicating 
the many and varied ways in which brain injuries may damage the neural tissue. 
Similarly, the electromagnetic pulse used to cause this inhibition of activity is only 
intended to affect brain areas which are near to the skull surface; the pulse does 
penetrate beyond those areas, but the effects of this excess inhibition are not fully 
understood. Despite these shortcomings, and the numerous areas in which standard 
examinations of brain-injury patients are superior, TMS remains a
rapidly growing area of research in cognitive neuropsychology.

The data derived from TMS is strongly theory-dependent in nature, in a similar 
fashion to computer simulation. Using this data as a reasonable method
of answering empirical questions requires that TMS researchers adopt a theoretical 
‘backstory.’ This backstory is a framework that categorises TMS data and lesion data 
together as examples of ‘real’ brain-damage data. Despite the fact that TMS brain 
damage is generated artificially, it is deemed admissible as ‘real’ data as long as 
neuroscientists consider this data Artificial₁ rather than Artificial₂ in classification.


3.6.4.2 Neuroscience Studies of Rats 


Studies of rats have long been common within the field of neuroscience, given the 
complex methodological and ethical difficulties involved in studying the human 
brain. Though such studies allow much greater flexibility due to their more 
uncontroversial participants, certain fundamental limitations still prohibit certain 
techniques; while ideally, researchers would prefer to take non-invasive, in situ 
recordings of the neural activity of normal, free-living rats, such recordings are well 
beyond the current state of the art. 

Instead, neuroscientists must settle for studies of artificially prepared rat brains 
or portions thereof; such a technique would be useful if the researcher wishes to 
determine the activity of specific neural pathways given a predetermined stimulus, 
for example. One study of the GABA-containing neurons within the medial 
geniculate body of the rat brain required that the rats in question were anaesthetised 
and dissected, with slices of the brain then prepared and stimulated directly (Peruzzi 
et al. 1997). Through this preparatory process, the researchers hoped to determine 
the arrangement of neural connections within the rat’s main auditory pathway, and 
by extension gain some insight into the possible arrangement of the analogous 
pathway in humans. 

Given how commonplace such techniques have become within modern neu- 
roscience, most researchers in that community would not question the empirical 
validity of this type of procedure; further, the inherent difficulties involved in 
attempting to identify such neural pathways within a living brain make such
procedures the only known viable way to take such measurements. However, 
one might argue that the behaviour of cortical cells in such a preserved culture, 
entirely separate from the original brain, would certainly differ significantly from 
the behaviour of those same cells when functioning in their usual context. The entire 
brain would provide various types of stimulus to the area of cortex under study, with 
these stimuli in turn varying in response to both the external environment and the 
rat’s own cognitive behaviour. 

With modern robotics and studies of adaptive behaviour focusing so strongly 
upon such notions of embodiment and situatedness, the prevailing wisdom within 
those fields holds that an organism’s coupling to its external environment is 
fundamental to that organism’s cognition and behaviour (Brooks 1991). In that 
case, neuroscience studies of this sort might very well be accused of generating 
and recording ‘artificial’ neural data, producing datasets that differ fundamentally 
from the real behaviour of such neurons. How then do neuroscientists apply such 
data to the behaviour of real rats, and in turn to humans and other higher mammals? 

In fact, research of this type proceeds on the assumption that such isolation of 
these cortical slices from normal cognitive activity actually increases the experimen- 
tal validity of the study. By removing the neurons from their natural environment, 
the effects of external stimuli can be eliminated, meaning that the experimenters 
have total control over the stimulus provided to those neurons. In addition, the 
chemical and electrical responses of each neuron to those precise stimuli can be 
measured precisely, and the influence of experimenter error can be minimised. 

With respect to the artificiality introduced into the study by using such means, 
neuroscience as a whole appears to have reached a consensus. Though removing 
these cortical slices from the rat is likely to fundamentally change the behaviour 
of those neurons as compared to their normal activation patterns, the individual 
behaviour of each neuron can be measured with much greater precision in this way; 
thus, the cortical slicing procedure can be viewed as Artificial₁, as the individual
neurons are behaving precisely as they should, despite the overall change in the
behaviour of that cortical slice when isolated from its natural context. Given the
(perhaps tacit) agreement that these methods are Artificial₁ in nature, the data from
such studies can be agreed to consist of ‘real’ neuroscience data. 


3.6.5 Artificial Data and the ‘Backstory’ 


The two examples above drawn from empirical science demonstrate that even com- 
monplace tools from relatively orthodox fields are nevertheless theory-dependent in 
their application, and further that such dependence is not necessarily immediately 
obvious or straightforward. The tacit assumptions underlying the use of TMS and rat 
studies in neuroscience are not formalised, but they do provide a working hypothesis 
which allows for the use of data derived from these methods as ‘real’ empirical data. 
The lack of a conclusive framework showing the validity of these techniques does 
not exclude them from use in the research community. 

This is of course welcome news for the strong Alife community, given that a 
tangible and formal definition of what constitutes a living system is quite far from 
being completed. However, strong Alife also lacks this tacit ‘backstory’ that is 
evident in the described empirical methods; without this backstory those methods 
may fall afoul of the Artificial₁/Artificial₂ distinction. With this in mind, how might
the Alife community describe a similar backstory which underwrites this research 
methodology as a valid method for producing new empirical datasets? 

The examples of TMS and rat neuroscience offer one possibility. In each of 
those cases, the investigative procedure begins with an uncontroversial example of 
the class of system under investigation (i.e., in the case of TMS, a human brain). 
This system is then prepared in some way in order to become more amenable 
to a particular brand of investigation; rat neuroscientists, by slicing and treating 
the cortical tissue, allow themselves much greater access to the functioning of 
individual neurons. In order to justify these modifications to the original system, 
these preparatory procedures must be seen as neutral in that they will not distort the 
resulting data to such a degree that the data becomes worthless. Indeed, in both TMS 
and rat neuroscience, the research community might argue that these preparatory 
procedures actually reduce some fairly significant limitations placed upon them by 
other empirical methodologies. 

At first blush such a theoretical framework seems reasonable for Alife. After 
all, other forms of ‘artificial life’ such as clones or recombinant bacteria begin
with such an uncontroversial living system and ‘prepare’ it while still producing
a result universally accepted to be another living system. However, this does fall
substantially short of one of the central goals of strong artificial life, as these systems 
certainly produce augmented datasets regarding the living systems involved, but 
they do not generate entirely novel datasets. While recombinant bacteria and 
mutated Drosophila may illuminate elements of those particular species that we 
may have been unable to uncover otherwise, the investigation of ‘life-as-it-could- 
be’ remains untouched by these efforts. 

In addition to these shortcomings, the preparatory procedures involved in a 
standard Alife simulation are quite far removed from those we see in standard 
biology (or the neuroscience examples discussed earlier). These simulated systems 
exist entirely in the digital realm, completely removed from the biological substrate 
of living systems; though many of these simulated systems may be based upon the 
form or behaviour of natural living systems, those systems remain separate from 
their simulated counterparts. 

Further, this computer simulation is ‘prepared’ in this digital substrate through 
a process of programming to produce this artificial system. Programming is 
inherently a highly variable process, the practice of which differs enormously from 
one practitioner to the next, in contrast to the highly standardised procedures of 
neuroscience and biology. The result of these preparations is to make the system 
amenable to creating life, rather than simply taking a previously-existing system 
and making it more amenable to a particular type of empirical observation. While 
characterising this extensive preparatory process as benign, as in neuroscience and 
biology, would be immensely appealing to the Alife community, the argument that 
preparing a computer and somehow creating life upon its digital substrate is a benign 
preparatory procedure is a difficult one to make. 


3.6.6 Silverman and Bullock’s Framework: A PSS Hypothesis for Life


Thus, while we have seen that even orthodox empirical science can be considered 
strongly theory-dependent, the gap between natural living systems and Alife 
systems remains intimidatingly wide. From this perspective, characterising Alife 
as a useful window onto ‘life-as-it-could-be’ is far from easy; in fact, Alife appears 
dangerously close to being a quintessentially Artificial₂ enterprise.

Newell and Simon’s seminal paper regarding the Physical Symbol System 
Hypothesis offers a possible answer to this dilemma (Newell and Simon 1976). 
Faced with a similar separation between the real system of interest (the intelligence 
displayed by the human brain) and their own attempt to replicate it digitally, 
the PSS Hypothesis offered a means of justifying their field. By establishing a 
framework under which their computers offer the ‘necessary and sufficient means’ 
for intelligent action, Newell and Simon also establish that any computer is only a 
short (albeit immensely complicated) step away from becoming an example of real, 
Artificial₁ intelligence.

The PSS Hypothesis also escapes from a potential theoretical conundrum by 
avoiding discussion of any perceived similarities between the behaviour of AI 
systems and natural intelligence. Instead Newell and Simon attempt to provide 
a base-level equivalence between computation and intelligence, arguing that the 
fundamental symbol-processing abilities displayed by natural intelligence are repli- 
cable in any symbol-processing system of sufficient capability. This neatly avoids 
the shaky ground of equating AI systems to human brains by instead equating 
intelligence to a form of computation; in this context, the idea that computers can 
produce a form of intelligence follows naturally. 

With this in mind, we return to Langton’s concept of the information ecology 
(1992). If we accept the premise that living systems have this ecology of information 
as their basis, then can we also argue that information-processing machines may
lead to the development of living systems? Following Newell and Simon’s lead, 
Silverman and Bullock (2004) offer a PSS Hypothesis for life: 


1. An information ecology provides the necessary and sufficient conditions for life. 
2. A suitably-programmed computer is an example of an information ecology. (Silverman 
and Bullock 2004, p. 5) 


Thus, assuming that the computer in question was appropriately programmed 
to take advantage of this property, then an Alife simulation could be regarded as 
a true living system, rather than an Artificial₂ imitation of life. As in Newell and
Simon, the computer becomes a system that is a sort of blank slate, needing only 
the appropriate programming to become a repository for digital life. 

This PSS Hypothesis handily removes the gap between the living systems 
which provide the inspiration for Alife and the systems produced by an Alife 
simulation. Given that computers inherently provide an information ecology, an 
Alife simulation can harness that property to create a living system within that 
computer. Strong Alife then becomes a means for producing entirely new datasets 
derived from genuine digital lifeforms rather than simply a method for creating 
behaviours that are reminiscent of natural life. 


3.6.7 The Importance of Backstory for the Modeller 


As discussed in earlier sections about examples of the theoretical backstory in 
empirical disciplines, the presence of such a backstory allows the experimenter to 
justify the usefulness of serious alterations to the system under study. In the case 
of rat neuroscience and TMS research, these backstories allow researchers to describe the
changes and preparations they make to their subjects as necessary means for data 
collection. 

In the case of Alife, such a claim is difficult to make, as noted previously, given 
that the simulation is creating data rather than collecting it from a pre-existing source 
as is the case in our empirical examples. The PSS Hypothesis for Life avoids this 
difficulty by stating that any given computer hardware substrate forms the necessary 
raw material for life; in a sense, our simulation is merely activating this potential to 
create an information ecology, and then collecting data from the result. Thus, we 
may say that our programming of the simulation has prepared this digital substrate 
for empirical data-collection in a manner similar to that of empirical studies. 

Of course, such a backstory is significant in scope, and could invite further 
criticism. However, such a statement is difficult to refute, given the unsophisticated 
nature of our current definitions for biological life, and a lack of agreement on 
whether life may exist in other substrates. Either way, the importance for the 
modeller is that such a simulation takes on a different character. The focus of 
such a theoretical justification is on implementing a model which produces
empirical data that may bear on our overall understanding of life, not on
simply tweaking an interesting computational system to probe the results. The PSS 
Hypothesis for Life gives us some basis on which to state that this view of Alife 
models has some theoretical validity. 


3.6.8 Where to Go from Here 


Now that we have produced a theoretical backstory which underwrites Alife 
research as a means for generating new empirical data points, what is missing? The 
PSS Hypothesis for Life allows us to pursue simulations of this type as a means 
for understanding life as a new form of biology, investigating the properties and 
behaviours of entirely novel organisms. One can imagine how such endeavours 
could provide interesting insight for those interested in the broader questions of 
what makes something alive. 

However, none of this gives our Alife results any direct relevance
to the biological sciences. The biologist may view our bird migration model,
justified as an information ecology under the PSS Hypothesis for Life, as interesting
but irrelevant to the concerns of the empirical ornithologist. Indeed, if our simulation
is producing merely a bird-like manifestation of digital life, then we have no basis 
on which to state that what we see in our results can tell us anything about real birds, 
completely removed from the virtual information ecology we have implemented. 

This issue becomes the focus of the next chapter, the final chapter of Part I. 
How might we apply Alife modelling techniques and agent-based models to the 
broader canvas of biological science? An in-depth discussion of methodological and 
theoretical issues related to modelling in population biology will provide us with the 
context necessary to begin to answer this question. The concerns of the strong Alife 
modeller as depicted in this chapter differ in several important respects from those of a
modeller with a weak Alife orientation, and those concerns focus largely on creating 
a relationship to external empirical data rather than creating their own empirical data 
as the PSS Hypothesis for Life describes. 


3.7 Summary and Conclusions 


ALife by its very nature is a field which depends upon the use and acceptance 
of artificially-generated data to investigate aspects of living systems. While some 
members of this research community have posited that artificial systems can be 
alive in a manner identical to biological life, there are numerous philosophical and 
methodological concerns inherent in such a viewpoint. 

The comparison with artificial intelligence laid out in this chapter illustrates 
some of the particular methodological difficulties apparent in ALife. ALife seeks to 
replicate properties of life which are heavily dependent on the biological substrate, 
in contrast with AI, which seeks to emulate higher-level properties of living 
organisms which seem more easily replicable outside of the biological realm. 

AI does not entirely escape the problem of artificiality in its data and methods,
however, nor for that matter does conventional empirical science. Some disciplines
are able to make use of a great deal of artificial data by tying it to a theoretical 
‘backstory’ of a sort. ALife up to now has lacked this backstory, while AI has Newell 
and Simon’s PSS Hypothesis (Newell and Simon 1976) as one example of such a 
theoretical endeavour. 

Silverman and Bullock (2004) used Newell and Simon’s account as an inspiration 
for a PSS Hypothesis for life, a framework which could be used to underwrite 
strong ALife as a form of empirical endeavour. By accepting that life is a form of 
information ecology which does not depend exclusively on a biological substrate, a 
researcher signing up to this backstory may argue for the use of strong ALife data 
as a form of empirical data. Initially such an idea is appealing; after all, the strong 
ALife proponent seeks to create forms of ‘life-as-it-could-be’ in a digital space, and 
this variant of the PSS Hypothesis describes such instances of digital life as a natural 
consequence of life itself being an information ecology. 

However, this framework does not account for the more pragmatic methodologi- 
cal concerns which affect modelling endeavours of this type, and in fact modelling 
in general. Constructing a computational model requires a great deal more than a 
useful theoretical justification to function appropriately and provide useful data. As 
one example, accepting ALife simulations as a form of empirical enquiry does not 
simplify the task of relating that data to similar data derived from entirely natural 
systems. As exciting as the prospect of digital life may be, creating self-contained 
digital ecosystems which are unrelatable to natural ecosystems seems of limited 
utility for the biologist. Such issues, examined in the context of mathematical 
models in population biology in the following chapter, will provide greater insight 
into these crucial elements of our growing theoretical framework for simulation 
research. This framework can then be expanded to bear on our upcoming discussion 
of simulation for the social sciences in the second overall section of this text. 


Chapter 4 
Modelling in Population Biology 


4.1 Overview 


While the previous chapter focused on lofty theoretical concerns regarding a 
particular variety of computational modelling methodology, the current chapter 
turns toward the pragmatic concerns facing simulation researchers in the discipline 
of population biology. Mathematical modelling has been popular for some time with 
population biologists, providing as it does a relatively simple means for tracking 
trends in natural populations, but the recent advent of agent-based modelling has 
added new possibilities for those in the community who seek a more detailed picture 
of natural populations than mathematical models can provide. 

Population biologists long had doubts about the general usefulness of simulation, 
as evidenced by Richard Levins’ criticisms of mathematical modelling techniques 
(Levins 1966). Levins argued that such methods faced fundamental limitations 
which would prevent simulation researchers from producing usable detailed models 
of populations. In the years following, debate has continued over Levins’ ideas, 
particularly given the possible application of those same criticisms to more sophis- 
ticated computational modelling methods (Orzack and Sober 1993; Odenbaugh 
2003). 

After elucidating the most vital points raised during the lengthy Levins debate, 
this chapter will investigate how these arguments pertain to methodologies such 
as Alife and agent-based modelling. While Levins did originally focus upon 
mathematical modelling, the pragmatic concerns he raises regarding modelling as 
an enterprise remain useful when attempting to construct detailed, yet tractable, 
simulation models of natural phenomena. 


Figures in this chapter were provided by Prof Seth Bullock, and are reprinted with permission. 
These concerns will be relevant as well during the remaining chapters, in which 
we will begin Part II of this text and delve into the issues surrounding social 
simulation. Levins’ pragmatic concerns will allow us to draw a contrast between the 
issues of agent-based modelling in Alife and biology with the concerns of modellers 
within the social sciences. The framework described here will give us valuable 
background for these comparisons, demonstrating methodological issues that are 
important both to empirical research and to agent-based modelling. This framework 
will also be useful to keep in mind in Part III, given the methodological similarities 
between population biology and demography. 


4.2 Levins’ Framework: Precision, Generality, and Realism 


4.2.1 Description of Levins’ Three Dimensions 


Levins’ 1966 paper ‘The Strategy of Model Building in Population Biology’ (Levins 
1966) triggered a long-running debate within the biological and philosophical 
community regarding the difficulties of successful mathematical modelling. Levins 
argued that during the construction of such models, population biologists must face 
the challenging task of balancing three factors present in any model of a natural 
system: precision, generality, and realism. He posits that only two of these three 
dimensions of a given model can be present to a significant degree in that model; 
strengthening two of these dimensions comes only at the expense of the third. 

To return to our bird migration example, a very precise and realistic 
model of the migration habits of a particular species of bird would necessarily lose 
generality, as such a model would be difficult to apply to other species. Likewise a 
broad-stroke model of migration habits would lose precision, as this model would 
not be able to cope with the many possible variations in migratory behaviour 
between species. Thus, a modeler must face the prospect of sacrificing one of these 
three dimensions when producing a model of a biological system, as no model could 
successfully integrate all three into a cohesive whole. 

In order to demonstrate more clearly the contrasting nature of these three 
modelling dimensions, Levins outlines a hierarchy of models which biologists 
can construct within this framework (Table 4.1). First, Type I models (referred to 
hereafter as L1) sacrifice generality for precision and realism; these models may 
be useful in situations in which the short-term behaviour of specific populations 
is relevant to the problem at hand. Second, Type II models (L2) sacrifice realism 
for generality and precision; these models could be useful in ecological models, 
for example, in which even heavily idealised models may still produce interesting 
results. Finally, Type III models (L3) sacrifice precision for generality and realism; 
models of this sort could be useful as general population biology models, in which 
realistic behaviour at the aggregate level is more important than accurate 
representation of individual-level behaviour. In a follow-up to his original description of 
these three dimensions, Levins clarifies his position that no model could be equally 
general, realistic and precise simultaneously, saying that a model of that sort would 
require such a complex and interdependent system of linked equations that it would 
be nearly impossible to analyse even if it did produce reasonable results (Levins 
1968). 


Table 4.1 Summary of Levins’ three modelling types 

Levinsian modelling framework 
L1 | Precision and realism at the expense of generality 
L2 | Generality and precision at the expense of realism 
L3 | Generality and realism at the expense of precision 

At the time of its publication, such a lucid description of the practicalities 
of model-building within population biology was unusual. Wimsatt (2001) lauds 
Levins for the way in which he ‘talked about strategies of model-building, which 
philosophers never discussed because that was from the context of (presumably non- 
rational) discovery, rather than the (seemingly endlessly rationalise-able) context of 
justification’ (p. 103). Levins’ (1968) book sought to further refine his ideas, and 
since then philosophers within biology and related disciplines have begun to take 
his ideas on board. 

Of course Levins did produce this argument as a specific criticism of mathemati- 
cal modelling efforts, but his description of these three key modelling dimensions is 
general enough in scope to provide a wealth of interesting implications for modern 
computational modelling. In an era when computer simulation is fast becoming 
the methodology of choice across numerous disciplines, such a cohesive argument 
outlining the fundamental limitations of modelling natural systems remains vital, 
and the strong criticisms of this framework coming from various corners of biology 
and philosophy serve an equally important role in bringing Levins’ ideas up-to-date 
with the current state-of-the-art. 


4.3 Levins’ L1, L2 and L3 Models: Examples and Analysis 


4.3.1 L1 Models: Sacrificing Generality 


As in the variation of our bird migration example presented above, studies which 
focus upon realism and precision must necessarily decrease their generality under 
Levins’ framework. Given that our hypothetical realistic and precise bird-migration 
model aims only to examine the migration habits of a particular bird species, the 
modeler will have greater difficulty in applying his results to other species. This is 
not to say that his data may not also explain by coincidence the migration habits of 
other bird species, but rather that his model displays those habits using the specific 
data of one species and does not set out with the intent of describing migration in a 
broader context. 
4.3.2 L2 Models: Sacrificing Realism 


An L2 model under Levins’ framework emphasizes generality and precision at the 
expense of realism. In this case, the bird modeler would reduce his connection to real 
data, hoping to produce a model which, while displaying deviations from empirical 
data regarding specific species or short-term migration situations, can provide a 
useful set of parameters or equations that can describe the phenomenon of bird 
migration in an idealised context. 

Levins notes that physicists entering the field of population biology often operate 
in this way, hoping that their idealised models may stray closer than expected to the 
equations that govern the overall behaviour they are investigating (Levins 1966). 
While this may seem misguided initially, a successful validation of data from such a 
model using empirically-gathered data from the real system could certainly provide 
a valid means for revising the model in a more realistic direction. In this way we 
might say that the physicist is tacitly accepting the fundamental limitation of such 
an idealised model by making no attempt to attach realism to his formalisation of 
the problem, and instead seeks to reach a more realistic perspective on the problem 
by integrating insights produced from ‘real’ data. 


4.3.3 L3 Models: Sacrificing Precision 


Taking our bird example in a slightly different direction, imagine that the population 
biologist chooses to eschew precision in favour of generality and realism. In this 
case, he is not concerned with matching his model’s data with that of the real- 
world system; instead he is interested in more simple comparisons, describing the 
behaviour of the birds in response to changes in environmental stimuli for example. 
The simulation may state that the birds migrate more quickly during particularly 
cold winters, for example, which allows him to draw a general conclusion about the 
behaviour of real birds during such times; the actual number of birds migrating, or 
the manner in which they migrate, is unimportant for drawing such conclusions 
in this context. Levins views this method as the most fruitful for his field, and 
accordingly follows this methodology himself. 


4.4 Orzack and Sober’s Rebuttal 


4.4.1 The Fallacy of Clearly Delineated Model Dimensions 


Orzack and Sober’s rebuttal to Levins’ proposals begins by attempting to clarify 
Levins’ descriptions of generality, precision and realism (Orzack and Sober 1993). 
In their view: generality refers to a greater correspondence to real-world systems; 
realism refers to the model’s ability to take account of more independent variables; 
and precision refers to the model’s ability to make valid predictions from the relevant 
output parameters. Further, Orzack and Sober contend that a model’s parameters can 
be specified or unspecified, specified models being those in which all parameters are 
given specific values. Importantly they consider unspecified models as necessarily 
more general, given the lack of assigned parameter values. Orzack and Sober argue 
that these characteristics of models interact in unexpected ways, and the resulting 
deviations from Levins’ framework may provide a means for modelers to escape 
that framework. 


4.4.2 Special Cases: The Inseparability of Levins’ Three 
Factors 


With these ideas in mind, Orzack and Sober propose that if any model is a special 
case of another, then the special case must necessarily start from the same position 
amongst Levins’ three dimensions as the original model (at least according to their 
definitions of generality, realism and precision). The new model thus will gain in 
one or more of the Levinsian dimensions without losing in the others (which would 
be the case under Levins’ formulation). This obviously clashes with Levins’ thesis, 
and as a result Orzack and Sober claim that, in some cases, these three properties 
are not connected, meaning that a trade-off between them is not necessary. 

In this way, assuming Orzack and Sober’s criticisms hold true, the modeler could 
continually refine a generic modelling framework related to a certain problem to 
evade Levins’ stated problems. For example, if a general unspecified model of our 
bird-migration problem was refined by taking into account new biological data, thus 
producing potential fixed values for the parameters of the model, then this specified 
version of that model has increased its realism without losing the generality and 
precision of its more generalised parent. Orzack and Sober would argue that the 
model has necessarily started from the same level of correspondence to real-world 
systems, ability to account for independent variables, and ability to make predictions 
as its parent, and as a special case of that model has gained in realism without 
sacrificing the other two dimensions in the process. 
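
The distinction between unspecified and specified models can be made concrete with a minimal Python sketch. The logistic growth equation used here is an assumption chosen purely for illustration, as are the function names and parameter values; none of them comes from Orzack and Sober or from any bird-migration data. 

```python
from functools import partial


def logistic_step(population, growth_rate, carrying_capacity):
    """One discrete-time step of an unspecified logistic model.

    growth_rate and carrying_capacity are left as free parameters, so
    this function stands for a whole family of models.
    """
    return population + growth_rate * population * (1 - population / carrying_capacity)


# A specified model: every parameter is given a fixed value.
# The values 0.5 and 1000.0 are hypothetical, not empirical estimates.
specified_step = partial(logistic_step, growth_rate=0.5, carrying_capacity=1000.0)


def run(step, initial_population, n_steps):
    """Iterate a one-argument step function and return the trajectory."""
    trajectory = [initial_population]
    for _ in range(n_steps):
        trajectory.append(step(trajectory[-1]))
    return trajectory


trajectory = run(specified_step, initial_population=100.0, n_steps=3)
```

In this sketch the specified model is literally a special case of the unspecified one: it is the same function with its parameters pinned, which is roughly the sense in which a specified model inherits its starting position from its parent. 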

Despite Orzack and Sober’s vehement position that Levins’ framework of mod- 
elling dimensions is fundamentally flawed, their criticisms do seem unsatisfactory. 
They neglect much mention of Levins’ points regarding the practicalities of model- 
building, and how these three dimensions can influence that process, and his 
related points regarding the inherent difficulty involved in analysing overly-complex 
models. 
4.5 Resolving the Debate: Intractability as the Fourth Factor 


4.5.1 Missing the Point? Levins’ Framework as Pragmatic 
Guideline 


Thus, despite the validity of Orzack and Sober’s concerns regarding Levins’ three 
dimensions as hard constraints on the modelling process, their complete deconstruc- 
tion of his hypotheses in rigorous semantic terms may be excessive. Levins’ points 
regarding the practicalities of model-building seem vital to an understanding of his 
underlying convictions; his framework seems to provide a summation of what, in 
his view, influences the usefulness and applicability of a given model to a particular 
problem. 


4.5.2 Odenbaugh’s Defence of Levins 


Odenbaugh (2003) defends Levins against the semantic arguments of Orzack and 
Sober by first noting the general peculiarities of Levins’ writing style which may 
lead the reader to draw more forceful conclusions than those originally intended. As 
a Marxist, Levins’ mention of ‘contradictory desiderata’ in relation to the tension 
between these three modelling dimensions is not used to imply that these properties 
are mutually exclusive or entirely logically inconsistent, but rather that these model 
properties can potentially influence one another negatively. Such terminology is 
common amongst those of a Marxist bent, but other readers may assume that Levins’ 
description of contradictory model properties does imply a hard mutual exclusivity. 

Odenbaugh also concedes that theoretically one may find models in which 
trade-offs between generality, realism and precision are unnecessary, or perhaps 
even impossible. However, he argues that Levins’ ideas are intended as pragmatic 
modelling guidelines rather than definitive constraints as implied by Orzack and 
Sober; Levins concerns himself not with formal properties of models but instead 
with concepts that may guide our construction of sensible models for natural 
systems. Odenbaugh concludes his analysis of Levins’ thesis by summarising his 
view of Levins’ intended purpose: 

... Levins’ discussion of tradeoffs in biological modelling concerns the tensions between 
our own limitations with respect to what we can compute, measure and understand, the 
aims we bring to our science, and the complexity of the systems themselves. (Odenbaugh 
2003, p. 17) 


Thus in one sense Levins’ ideas stretch beyond the formal guidelines Orzack 
and Sober accuse him of constructing, and instead encompass statements regarding 
the fundamental limits of tractability and understanding which underlie all 
modelling endeavours. 
4.5.3 Intractability as the Fourth Factor: A Refinement 


Odenbaugh’s defence of Levins gives us an important additional concept that may 
be useful for refining Levins’ modelling framework. He mentions the importance 
of computability, and the tensions that such practical concerns can produce for 
the modeller. Of course, as already intimated by Levins, a model’s amenability 
to analysis is also of primary concern for the modeller in any field. Odenbaugh 
makes a valuable point, demonstrating that Levins’ modelling dimensions are of 
primary importance to the modeller seeking pragmatic guidance in the task of 
constructing a model that can be successfully analysed. Here we use the term 
‘tractability’ to refer to this characteristic of a given model; while other terms such 
as ‘opacity’ may also be appropriate, ‘tractability’ carries the additional connotation 
of mathematical or computational difficulty, and thus seems preferable in the context 
of Odenbaugh’s formulation. 

While Odenbaugh’s synthesis of Levins’ claims does greatly diminish the effect 
of Orzack and Sober’s analysis on the validity of Levins’ claims, one might take this 
rebuttal even further, pointing to Levins’ ideas regarding tractability as a potential 
solution to this semantic dilemma. While the hypothetical models proposed by 
Orzack and Sober are certainly logically possible, they ignore one of Levins’ central 
concerns: if one fails to make a trade-off between these three dimensions while 
constructing a model, the model will become extremely complex and thus difficult, 
or impossible, to analyse. In a sense, tractability becomes a fourth dimension 
in Levins’ formulation: the closer one gets to a complete balance of generality, 
precision and realism, the further one moves towards intractability. 

To clarify this relationship somewhat, we might imagine Levins’ three model 
dimensions as forming the vertices of a triangle, with each vertex representing one 
of the three dimensions (Fig. 4.1). When we construct a model, that model occupies 
some place within that triangle, its position indicating its relative focus on each of 
those three dimensions. In this arrangement, moving a model toward one side of 
the triangle (representing a focus upon two of the three model dimensions) would 
necessarily move it away from the third vertex, representing the trade-off between 
generality, precision and realism. 


Fig. 4.1 Levins’ three model dimensions (recovered vertex label: Generality) 

Fig. 4.2 Levins’ modelling dimensions with the addition of tractability (recovered 
vertex labels: Realistic, Precise, General, Tractable) 

However, as per this clarification which places tractability as a fourth dimension 
in Levins’ modelling framework, we may add a fourth vertex to our triangle, 
extending it to a tetrahedron. In this arrangement a model which tends toward the 
centre of the triangle, representing an equal balance of all three initial dimensions, 
could be seen as approaching that fourth dimension of intractability (Fig. 4.2). The 
placement of intractability into the modelling dimensions is critical, as this allows 
for models which balance all three Levinsian factors to exist; the cost of doing so 
becomes a loss of tractability rather than an outright declaration of impossibility. 
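
One way to make this geometric picture concrete is the following minimal Python sketch. It treats a model's position in the triangle as three barycentric weights and reads the height toward the fourth vertex from how balanced those weights are. The class name, the function names, and in particular the formula `3 * min(weights)` are assumptions introduced here for illustration; they are not part of Levins' or Odenbaugh's arguments, and other proxies would serve equally well. 

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPosition:
    """Position of a model inside the Levinsian triangle.

    The three weights are barycentric coordinates: non-negative
    emphases on each dimension, normalised to sum to 1.
    """
    generality: float
    precision: float
    realism: float

    def __post_init__(self):
        weights = (self.generality, self.precision, self.realism)
        if any(w < 0 for w in weights):
            raise ValueError("weights must be non-negative")
        if abs(sum(weights) - 1.0) > 1e-9:
            raise ValueError("weights must sum to 1")

    def intractability(self):
        """Hypothetical proxy: 0 on any edge of the triangle (one
        dimension fully sacrificed), 1 at the centre (perfect balance)."""
        return 3.0 * min(self.generality, self.precision, self.realism)


def from_emphasis(generality, precision, realism):
    """Normalise raw emphasis scores into a ModelPosition."""
    total = generality + precision + realism
    if total <= 0:
        raise ValueError("at least one emphasis must be positive")
    return ModelPosition(generality / total, precision / total, realism / total)


l1_model = from_emphasis(generality=0, precision=1, realism=1)  # L1: generality sacrificed
balanced = from_emphasis(generality=1, precision=1, realism=1)  # all three in balance
```

Under this proxy an L1, L2 or L3 model that gives up one dimension entirely sits on an edge of the triangle and scores zero, while a model that weights all three dimensions equally sits at the centre and scores one, mirroring the claim that balance is possible but is paid for in tractability. 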

Of course, any diagram of such an interplay between such broad concepts will 
fall short of demonstrating the full import of the ideas described here, not least 
because such a diagram requires an illustration of four interacting dimensions. 
This diagram does not entirely avoid such difficulties, as one can clearly imagine 
paths within such a space which avoid the pitfalls described by Levins. However, 
this representation proves preferable to other potential approaches due to its useful 
illustration of the space of models described by Levins’ dimensions. If we imagine 
that any movements of the model in question toward the balanced centre of the 
Levinsian triangle result in a related movement toward intractability and the upper 
vertex of our tetrahedron, then the representation becomes clearer. 

In any case, this diagram of the delicate balance of all four factors seeks to 
express Levins’ idea, as clarified by Odenbaugh (2003), that his pragmatic concerns 
represent a fundamental limitation on the part of the researcher rather than the model 
itself. While a model may potentially encompass all three dimensions equally, the 
parameters of the model will become intractable and impossible to analyse to the 
point of exceeding the cognitive limitations of the scientist. The exponentially 
more difficult task of analysing the model in such a circumstance eliminates the 
time-saving benefits of creating a model in the first place, necessitating a carefully- 
considered balancing of the computational demands of a model with its consistency 
along these dimensions of realism, generality and precision. 


4.6 A Levinsian Framework for Alife 


4.6.1 Population Biology vs. Alife: A Lack of Data 


While the pragmatic concerns illuminated by Levins are clearly relevant for the 
biological modeller, there are significant differences between the mathematical 
modelling approach and the Alife approach as discussed in the previous chapter. 
Mathematical models for population biology often begin with field-collected data 
that is used to derive appropriate equations for the model in question. For example, 
our bird migration researchers may wish to establish a model of migration rates in a 
bird population using these more traditional methods, rather than the computational 
examples of possible bird migration studies examined in the previous chapter. In 
this case, they would first perform field studies, establish patterns of behaviour for 
those populations under study, and tweak their model parameters based upon that 
initial empirical data. 

By contrast, many Alife studies may begin with no such background data. 
For example, a modeller that wishes to examine the development of signaling 
within animal populations would have great difficulty obtaining empirical data 
relevant to such a broad question. Looking at such an extensive evolutionary 
process is naturally going to involve a certain scarcity of usable empirical data; 
after all, watching a species evolve over thousands of generations is not always a 
practical approach. Thus the researcher must construct a simulation using a set of 
assumptions which best fit current theoretical frameworks about the development 
of signalling. Given the methodology of agent-based simulation in particular, many 
of these simulations will be programmed using a ‘bottom-up’ approach in which 
low-level factors are modeled in the hopes of producing high-level complexity from 
the interactions of those factors. 
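
The ‘bottom-up’ style of construction can be sketched in a few lines of Python. Everything in this example is hypothetical: the agents, the copying rule, and the parameter values are illustrative assumptions rather than a model drawn from the Alife literature discussed in this text. Each agent holds one of two arbitrary signals and occasionally copies another agent; the population-level share of each signal is never programmed directly, but arises from the repeated low-level interactions. 

```python
import random


def run_signalling(n_agents=50, n_steps=2000, seed=0):
    """Toy bottom-up simulation of a signalling convention.

    Low-level rule (an assumption): at each step one randomly chosen
    listener copies the signal of one randomly chosen speaker.
    Returns the share of agents using signal 1 after every step.
    """
    rng = random.Random(seed)
    signals = [rng.randint(0, 1) for _ in range(n_agents)]
    shares = [sum(signals) / n_agents]
    for _ in range(n_steps):
        # Pick two distinct agents; the first speaks, the second listens.
        speaker, listener = rng.sample(range(n_agents), 2)
        signals[listener] = signals[speaker]
        shares.append(sum(signals) / n_agents)
    return shares


shares = run_signalling()
```

The returned series is the kind of aggregate output such a simulation produces in quantity; whether it says anything about real signalling systems is exactly the question the surrounding discussion raises. 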

Of course, in order to produce results in a reasonable period of time which are 
amenable to analysis, such models must be quite simplified, often making use of 
highly-idealised artificial worlds with similarly simplified rules governing those 
organisms within the simulation. With these simulations producing reams of data 
relating only to a highly-simplified version of real-world phenomena, will these 
models necessarily obey Levins’ thesis regarding the trade-off between generality, 
precision and realism? For example, need an Alife modeler concern himself with 
such factors as ‘precision’ when the model cannot be, and is not intended to be, a 
realistic representation of a specific animal population? 


4.6.2 Levinsian Alife: A Framework for Artificial Data? 


Despite the inherent artificiality of data produced within Alife, Levins’ framework 
can still apply to such a methodology. An Alife model designed in a highly 
abstract fashion, as in the above communication example, can simply be placed 
amongst the L3 models, as it aims to produce data which illuminates general trends 
in organisms which seek to communicate, rather than orienting itself towards a 
particular population. Some models may also use a specific species as a basis 
for an Alife simulation, thus leading that model away from generality and toward 
realism and precision. Similarly, overly complex simulations, like overly complex 
mathematical models, become difficult to compute and hence cumbersome; this 
implicitly limits the precision of an artificial life simulation. 

However, as hinted above and as described by Orzack and Sober (1993), the 
simulation researcher will run into trouble regarding the meaning of these three 
dimensions within this trade-off. One might question the utility of regarding 
any simulation which deals with broad-stroke questions using idealised worlds 
and populations as ‘realistic’ or ‘precise’ in any conventional sense. Despite this 
problem, as Taylor describes, Levins did provide a caveat that models should 
be seen as necessarily ‘false, incomplete, [and] inadequate’ but ‘productive of 
qualitative and general insights’ (Taylor 2000, p. 197). With this in mind we 
might take Levins’ framework as a very loose pragmatic doctrine when applied to 
computer simulation; we might ask whether a simulation resembles reality rather 
than accurately represents reality. 


4.6.3 Resembling Reality and Sites of Sociality 


This pragmatic application of the general thrust of Levins’ framework might seem 
a relief to simulation researchers and theorists, but still a great problem remains. 
Determining whether a simulation resembles reality is far from straightforward, and 
might be evaluated in a number of different ways. The evaluation of a model as 
a reasonable resemblance to the real system which inspired it could appear quite 
different depending on the design of the simulation itself or the appearance of the 
data produced by the simulation, assuming of course that any conventional statistical 
data is produced at all. This can easily result in rather overdramatic or otherwise 
inappropriate conclusions being drawn from the study in question. 
Take, for example, two groups of researchers, each attempting a realistic and 
enlightening model of our migrating birds. Group A models the birds’ physi- 
cal attributes precisely using three-dimensional computer graphics, models their 
movements in exacting detail using motion-capture equipment on real birds, and 
translates it all into a breathtaking display of flying birds on the monitor. Meanwhile, 
Group B models the birds only as abstract entities, mere collections of data which 
move, migrate and reproduce only in a computational space created by skillful 
programming. The model uses pre-existing data on bird migrations to produce new 
insights about why and how birds tend to migrate throughout the world. 

In this example, how can we say which model is more realistic or has a more 
pronounced resemblance to reality? Some may claim that Group A’s model provides 
a provocative and insightful look into how birds move through the air, while others 
may claim that Group B’s model produces more exciting data for biologists and 
zoologists despite the abstractions applied to the birds within that simulation. In 
reality such a comparison is virtually meaningless without a specified context, as each 
model is designed for an entirely different purpose. Each model may resemble 
certain elements of bird behaviour in certain enlightening ways, but neither takes 
into account the effects of factors addressed in the other model. 

Taylor (2000) continues upon this line of thinking, noting the existence of varying 
‘sites of sociality’ in which modelers must operate. These sites correspond to 
points at which social considerations within a scientific discipline begin to define 
parameters of the model, rather than considerations brought about by the subject 
matter of the model itself. Thus, if Group A presents their graceful three-dimensional 
birds at a zoology conference, they are likely to receive acclaim for their model’s accuracy. 
Similarly, if Group B presents their abstract migration model to a conference of 
population biologists, they are more likely to receive a warm reception for their work. 
In either case, if Group A and B switched places, those communities would likely 
make their presentations rather more difficult. 

Such discussions are very relevant for many researchers within the computational 
modelling community. For example, Vickerstaff and Di Paolo’s model of path 
integration in ants provides one instance of a model crossing between research 
communities (Vickerstaff and Di Paolo 2005). The model is entirely computational 
in nature, and thus provides no hard empirical data, and yet the model was accepted 
and published by the Journal of Experimental Biology. The authors describe the 
process of acceptance as an interesting one; the editors of the journal needed to 
be convinced that the model was relevant enough and the ideas it presented novel 
enough to warrant the interest of the journal’s readership. 

Taylor’s discussion of social considerations within given scientific disciplines 
makes this an interesting happening. If the model had been pitched slightly 
differently, say by using much more complex neural models, or making vastly 
inappropriate biological assumptions in the model’s implementation, then this 
group would likely have been less receptive regarding the model’s publication for 
their readership. For such a community, biological relevance is highly important; 
consider the contrast with journals like Artificial Life, for example, in which the 
presentation of computational models is far more broad in nature. 
4.6.4 Theory-Dependence Revisited 


This idea of sites of sociality harkens back to the previous chapter’s discussion of 
theory-dependence in science. The construction of any model, whether mathemat- 
ical, computational, or conceptual, is an inherently theory-dependent exercise. All 
varieties of models must simplify elements of reality in order to study it in these 
reduced forms; otherwise the modeler has gained little from avoiding the more 
costly and time-consuming methods of empirical data-gathering which may produce 
similar results. 

With all of these abstractions and simplifications, however, a certain resemblance 
to reality must be maintained. Our detailed three-dimensional model of flying birds 
maintains a useful resemblance to real-world flying behaviours; likewise, our model 
of migrating birds, while more abstracted, may retain that useful resemblance 
by demonstrating migration patterns observed in real bird populations, or even 
demonstrating the functioning of this higher-level behaviour amongst a generalised 
population (as is the goal in much of Alife). 

Determining whether a model resembles reality is thus an inherently pragmatic 
process, depending upon both the structure of the model itself and the structure of 
its intended audience. Some questionable elements of a model may become obvious 
when the model is presented, but on the other hand subtle assumptions that may 
affect the model’s results will be far less obvious. This makes evaluation of that 
model even more difficult, as extracting the nature of a model’s theory-dependence 
is not always straightforward when observing the results, and even then different 
audiences may perceive those theory-dependent elements differently. Of course this 
does not imply complete relativism in scientific results, but this view does stress the 
importance of the audience for those results in determining the overall success of a 
modelling effort. 


4.7 Tractability Revisited 


4.7.1 Tractability and Braitenberg’s Law 


As we have seen, our revised four-factor version of Levins’ framework gives us 
a pragmatic theoretical background for building computer simulations in which all 
such simulations are fundamentally constrained by problems of tractability. This can 
lead to such situations as those described above, in which greatly-simplified models 
are far more tractable but also in turn much more abstracted, making for potentially 
provocative and misleading results. However, as indicated in our examinations 
of Alife simulations, there does seem to be a greater degree of such grandiose 
approaches in computer simulation than in mathematical modelling. What about 
computer simulation leads some to attempt to achieve provocative results at the 
expense of tractability? 
The answer may lie in a fundamental difference in the construction of computer 
simulations and mathematical models. Braitenberg’s influential work Vehicles 
(Braitenberg 1984) provides an insight into such elements of model construction. 
In this work, he outlines a series of thought experiments in which he describes the 
construction of highly-simplified vehicles, each incorporating only basic sensors, 
which can nevertheless display surprisingly complex behaviour that is startlingly 
reminiscent of living creatures. The most famous example of these is his light- 
seeking vehicle, which is drawn towards a light-source simply by having a light- 
sensor directly hooked to its wheels, producing a simple light-seeking behaviour 
reminiscent of that of moths. 
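
The wiring described above is simple enough to sketch in a few lines of Python. What follows is a minimal illustration rather than Braitenberg's own specification: the two-sensor layout with crossed connections, the intensity function, and every name and parameter value (intensity, step, simulate, base, gain, width) are assumptions made for this example.

```python
import math


def intensity(px, py, light):
    """Light intensity at a point, falling off with squared distance."""
    lx, ly = light
    return 1.0 / (1.0 + (px - lx) ** 2 + (py - ly) ** 2)


def step(x, y, heading, light, dt=0.1, base=0.2, gain=5.0, width=0.5):
    """Advance a two-wheeled vehicle by one time step."""
    # Two sensors sit ahead of the body, angled 45 degrees left and right.
    # For brevity, `width` serves as both sensor offset and axle length.
    angle = math.pi / 4
    left = intensity(x + width * math.cos(heading + angle),
                     y + width * math.sin(heading + angle), light)
    right = intensity(x + width * math.cos(heading - angle),
                      y + width * math.sin(heading - angle), light)
    # Crossed wiring: each sensor excites the wheel on the opposite side,
    # so the wheel farther from the light spins faster and the body turns
    # towards the source.
    v_right = base + gain * left
    v_left = base + gain * right
    speed = (v_left + v_right) / 2.0
    heading += (v_right - v_left) / width * dt  # positive = turn left
    x += speed * math.cos(heading) * dt
    y += speed * math.sin(heading) * dt
    return x, y, heading


def simulate(light=(3.0, 4.0), steps=600):
    """Run the vehicle from the origin, facing east; return closest approach."""
    x, y, heading = 0.0, 0.0, 0.0
    closest = math.hypot(light[0], light[1])
    for _ in range(steps):
        x, y, heading = step(x, y, heading, light)
        closest = min(closest, math.hypot(x - light[0], y - light[1]))
    return closest


if __name__ == "__main__":
    print(f"closest approach to the light: {simulate():.2f}")
```

Nothing in the code mentions seeking or approaching; the behaviour comes entirely from the sensor-to-wheel coupling. Swapping the two wheel assignments (uncrossed wiring) should instead steer the vehicle away from the source.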

During the course of the book Braitenberg proposes an idea known as the ‘law
of uphill analysis and downhill invention’, describing the inherent difficulty in
capturing complex behaviour from an external perspective:


... it is much more difficult to start from the outside and try to guess internal structure just
from the observation of behaviour.... It is pleasurable and easy to create little machines 
that do certain tricks. [But] analysis is more difficult than invention just as induction takes 
more time than deduction. 


For Braitenberg then, his vehicular ‘experiments in synthetic psychology’ may 
have provided an easier path to insight than dutiful, and most likely tedious, 
observation and inference from the behaviour in a real system. 

Braitenberg’s Law illuminates the difference between the factors of analysis and 
invention in mathematical models and computer simulations. When constructing 
a mathematical model, this construction (invention) is quite tightly coupled to the 
analysis that will result from the model, as a mathematical model which is nearly 
impossible to analyse will be correspondingly difficult to modify and tweak. As 
Braitenberg describes, the ‘invention’ is easier than the analysis, and in the case of a 
mathematical model the difficulty in analysis will also affect that invention process. 

When constructing a computer simulation the disconnect between these two 
factors is much larger. In the case of agent-based models, each agent is constructed 
as a confluence of various combined assumptions laid out during the ‘invention’ 
process, and once the simulation begins those agents will interact in non-trivial 
ways both with other agents and with the related virtual environment. This results 
in a troublesome opacity for the analyst, as the mechanics and the results of the 
simulation do not necessarily have a direct correspondence. 

For example, say our abstracted bird migration model from earlier used neural 
networks to drive the simulated birds’ movements and their responses to environ- 
mental stimuli. When analysing the results, determining how those complex sets 
of neural connection weights correspond to the observed higher-level behaviours is 
a herculean task, and even more so when we consider how those weights change 
in response to a complex and changeable virtual environment. Even though these 
network weights influence the behaviour of each virtual bird, and thus have a direct 
impact upon that simulated environment, the relationships of those weights to the 
simulation results are far from obvious.

Clearly then the coupling between the synthesis and analysis of an agent-based
model is significantly looser than that of a mathematical model. As these simulations 
continue to grow in size and complexity, the resultant opacity of these models could 
become so severe that synthesis actually forgoes analysis. In this nightmare scenario 
the process of analysing such an opaque simulation can become so time-consuming 
that the time saved by running that simulation is eclipsed by the time wasted when 
trying to penetrate its opacity. 


4.7.2 David Marr’s Classical Cascade 


A useful perspective on this tractability problem comes from artificial intelligence 
via David Marr’s description of hierarchical levels of description (Marr 1972). Marr 
discusses three levels at which the researcher develops explanations of behaviour: 
level 1 (hereafter referred to as M1), in which the behaviour of interest is identified 
computationally; level 2 (M2), in which an algorithm is designed that is capable 
of solving this behavioural problem; and level 3 (M3), in which this behavioural 
algorithm is implemented (Table 4.2). The agent-based simulation methodology 
tends to deviate from this hierarchy, however, as many such models are not so 
easily described in terms of a single algorithm designed to solve a single behavioural 
problem. 

Peacocke’s extended version of Marr’s hierarchy seems more applicable to our 
concerns (Peacocke 1986). He adds a level between M1 and M2 in Marr’s hierarchy,
appropriately termed level 1.5 (M1.5), in which the modeler adds to his initial
M1 specification of his problem in computational terms by drawing upon a ‘body
of information’ appropriate to the behaviour of interest. For instance, our bird
migration model from earlier may incorporate an artificial neural network which
drives the behaviour of individual agents within the model, in addition to the other
aspects of the agents’ environment that drive their behaviour. In this case we define
M1.5 to include both the original formulation of the model and the resources the 
model draws upon to solve the behavioural problem of interest, which for our bird 
migration model would be the artificial neural network driving each agent. 

From constructing our bird migration model, we would naturally proceed to running
the simulation and obtaining results for analysis. However, under Peacocke’s
analysis, we have essentially just skipped the M2 portion of Marr’s hierarchy
entirely. We specified our problem computationally, constructed a model using an
appropriate body of information, and then proceeded directly to the implementation
step without developing a distinct algorithm. Unfortunately, M2 is identified as
the level of this hierarchy which produces the most insight into the behaviour of
interest; that M2 algorithmic understanding of the problem would normally allow
the researcher to develop a more in-depth explanation of the resultant behaviour.


Table 4.2 Summary of Marr’s levels of description

Level | Description
M1 | Problem identified computationally
M2 | Algorithm designed to solve the problem
M3 | Behavioural algorithm is implemented


4.7.3 Recovering Algorithmic Understanding 


With this problem in mind, agent-based modelers may choose to attempt to step 
backwards through the classical cascade, proceeding from M3 back to M2 in order 
to achieve this useful algorithmic understanding that is denied to them by this 
simulation methodology. However, as noted earlier, these models already sit in a 
difficult position between synthesis and analysis. How might we jump backwards to 
divine this algorithm, when our model may already be too complex in its synthesis 
to promote the type of analysis which may produce that algorithmic understanding? 

Perhaps a better course would be to ask why an agent-based modeler needs 
this type of explanation at all. Agent-based models are most often developed to 
produce emergent behaviour, which by its very nature is not particularly amenable 
to algorithmic reduction. A complex high-level behaviour deriving from simple low- 
level interactions is not something easily quantified into a discrete form. 

Similarly, the rules which define a given simulation may not lend themselves 
easily to such analysis either; the results of a simulation may be too opaque to 
narrow down the specific influences and effects of any of the low-level rules 
which produce that emergent behaviour. Alternatively, if the researcher produces 
a model which bears a useful resemblance to the behaviour of interest as seen in 
the natural system, then this validation against empirical results may be enough of 
a confirmation of the model’s concept that these complicated analyses may not be 
viewed with much importance. 


4.7.4 Randall Beer and Recovering Algorithmic Understanding 


Of course, while recovering this algorithmic understanding may be difficult for the 
agent-based modeller, this is by no means an impossible task. For example, even the 
simplest artificial neural network can be quite opaque when attempting a detailed 
analysis of the behaviour of that network; neural network connection weights can 
provide novel solutions to certain tasks within a model, but unraveling the meaning 
of the connection weights in relation to the problem at hand can take a great deal of 
effort and determination. 

Despite these difficulties, such analyses have been performed, but only after a 
great deal of concentrated effort using novel cluster-analysis techniques and similar 
methods. A seminal example of this is Randall Beer’s examination of a minimally 
cognitive agent performing a categorical perception task (Beer 1995, 2003b). Beer’s
agent employs a network of seven neurons to discriminate between falling objects,
tasked with catching circular objects while avoiding diamond-shaped ones. The best 
evolved agents were able to discriminate between objects 99.83% of the time after 
80 generations of evolution. Beer’s goal was to develop a minimally-cognitive agent 
performing a vital, but simple, task, then analyse the simulation to understand how 
the task is successfully performed, and the psychophysical properties of the agent. 

In order to study the agent and its performance as part of this coupled brain-body- 
environment system, Beer analyses the agent and its behaviour in excruciating detail 
using the language of dynamical systems theory. He argues that such analysis must 
form the backbone of a proper understanding of any given simulated phenomenon; 
as Beer states, ‘it is only by working through the details of tractable, concrete 
examples of cognitive phenomena of interest that we can begin to glimpse whatever 
general theoretical principles may exist’ (Beer 2003b, p. 31; see also Beer 2003a).

Beer’s analysis is not without its detractors, even in this case. Webb provides a
lengthy treatise on the lack of biological relevance in Alife models, using Beer as
a central example (Webb 2009). Webb contends that while Beer’s analysis is amply
justified in his paper and the relevant responses (Beer 2003a,b), Beer contradicts
himself when discussing the relevance of his model. Beer maintains that he does
not seek to create a serious model of categorical perception, and yet this sits
uneasily with his later statements regarding the value of his model in
discovering properties of categorical perception; thus, Webb says, ‘empirical claims
are being made about the world based on the model results’ (Webb 2009,
p. 11). In Webb’s view, Beer attempts to take his analysis a step beyond where
he himself claims it has actual validity: Beer does not wish his simulation to be
considered a true model of a natural system, and yet wishes also to use his analysis
to make claims as if it were such a model.

Webb’s discussion is an important one, as it bears upon our further discussion 
of the difficulties of artificial worlds. Beer’s analysis is detailed, and he largely 
succeeds in avoiding a major pitfall of models of this type and their lack of 
algorithmic understanding. Yet, in the process of doing so, he also places emphasis 
for the reader on another concern, namely the relevance of such models outside of 
an exercise in formal analysis or the study of a simulated system for its own sake. 


4.7.5 The Lure of Artificial Worlds 


This discussion of algorithmic understanding, or the lack thereof, in agent-based 
simulations provides an insight into the lure of these artificial worlds for the research 
community. By constructing these models the researcher has essentially replaced the 
nuances of the real system of interest with a simplified artificial version, avoiding 
the complexities and occasional impossibilities (both practical and financial) of 
attempting certain empirical studies. In addition, as noted earlier and as discussed by 
Webb, the study of this artificial world becomes a sort of pseudo-empirical activity 
on its own as the researcher must tweak parameter settings and observe many runs
of this compressed and abstracted version of the natural system. Of course these
models can exhibit the aforementioned opacity to analysis, but many researchers
leave the black-box internals of their simulations alone, content to skip what is,
as noted above, widely acknowledged to be an essential step in developing an
understanding of the behaviour in evidence.

This tactic for avoiding the complex questions raised by a lack of algorithmic 
understanding in agent-based modelling is also helpful for those troubled by Levins’ 
framework. For a relatively small initial investment for the researcher (small at least 
in comparison to large-scale empirical studies), one appears to gain a great deal of 
insight through this artificial world, and perhaps even avoid the issues of realism, 
generality and precision that limit the application of more traditional modelling 
methods. However, the fourth dimension of Levins’ modelling framework adds 
a tractability ceiling to any modelling endeavour, regardless of scope or method; 
while an artfully-constructed model may allow an equal balance of the three basic 
Levinsian factors, this tractability ceiling will greatly limit the ability of that model 
to produce useful and analysable data. 

Initially the use of artificial worlds in computational models appears to propel 
the model in question through the tractability ceiling. The modeler can integrate the 
three Levinsian factors much more easily in an artificial world, lacking as it does the 
complexities apparent in the natural world. Marr, Peacocke and Clark remove this 
possibility as well, however, by exposing the inability of the artificial-world modeler 
to achieve an algorithmic understanding of the system he chooses to model. In that 
sense an artificial world model can strive to be little more than a proof-of-concept, 
an indication of possible directions for empirical research but of very little use in 
deriving actual useful conclusions about the behaviour of interest. 


4.8 Saving Simulation: Finding a Place for Artificial Worlds 


4.8.1 Shifting the Tractability Ceiling 


Thus far our attempts to find a justification for models utilising artificial worlds have 
been rebuffed quite frequently; even the relatively promising framework for strong 
Alife outlined in Chap. 3 provides little insight into creating useful models of that
type. Levins has criticised modellers for failing to balance generality, realism and 
precision in our models; our refinement of Levins’ three factors has demonstrated 
the dangers of constructing a balanced, but intractable model; and Marr, Peacocke 
and Clark have pointed out the failing of certain models to develop an algorithmic 
understanding of a natural system or behaviour. How then might our artificial worlds 
of roving software agents contribute to the realm of empirical science? 

Thankfully, agent-based simulations and similar methodologies do not exist
in a vacuum. The tractability ceiling is a formidable obstacle, but not a static
one: as computing power increases, for example, developing and running
increasingly complex simulations becomes ever more feasible. In addition, the
continual refinement of theoretical frameworks in empirical science, which drive
the simplifying assumptions that underlie the creation of these artificial worlds,
can completely alter our understanding of a particular problem, and thus alter our
methods for developing an understanding of that problem.


Fig. 4.3 Breaking through the tractability ceiling [figure label: Tractable]

However, even with this shifting tractability ceiling (Fig. 4.3), we are left with
substantial theoretical difficulties. How can we avoid the seemingly insurmountable 
problems raised by Marr, Peacocke and Clark? With simulations easily falling into 
the trap of opacity, how can we still claim that our models remain a useful method 
for empirical science? 


4.8.2 Simulation as Hypothesis-Testing 


Agent-based models may serve as a much more fertile method of research when 
used as a well-designed adjunct to more conventional empirical methods. Like 
Noble, Di Paolo and Bullock’s ‘opaque thought experiments,’ simulations may 
function best as a method for testing the limitations of a hypothesis (Di Paolo et al. 
2000). Our agent-based model of bird migration could provide a way to examine 
how existing theories of migration might predict the future movements of the real 
species, and through its results produce some potential avenues for empirical testing 
of those predictions. 

The greatest problem lies, as we have seen, in using a simulation to produce 
useful empirical data on its own. While strong Alife theorists may choose to escape 
this problem by contending that their artificial worlds are in fact ‘real’ worlds full 
of their own digital biodiversity, more conventional approaches suffer under the 
constraints particular to the simulation methodology. Overly abstracted simulations
may not produce the desired resemblance to the natural systems, while overly
complex simulations may actually preclude a sensible analysis of the resulting data. 

Through all of these various arguments regarding the simulation methodology, 
the difficulty of empirical data-production remains the most prominent. Beyond 
the trap of opacity lies another trap: simulation models can all too easily lend 
themselves to being studied simply for their own sake. As these models brush 
against the tractability ceiling, they must necessarily fall short of the complexity of 
the real system of interest; the model cannot fundamentally replace the real system 
as a target for experimentation. Studying a model simply to test the boundaries 
and behaviours of the attached artificial world (as in Ray’s Tierra, for example) 
may certainly lead to intriguing intellectual questions, but is unlikely to lead to 
substantive conclusions regarding the natural world. 


4.9 Summary and Conclusion 


Looking through the lens of population biology, we have seen the varied and 
disparate opinions dividing the simulation community. Levins’ early debates about 
the merits and pitfalls of mathematical modelling continue today in a new form 
as computational modelling continues to grow in prominence throughout various 
disciplines. 

The limitations outlined by Levins remain relevant precisely because they are 
linked to the fundamental shortcomings of any modelling endeavour, rather than 
being confined particularly to the realm of mathematical models. The three dimen- 
sions of generality, precision and realism provide a useful pragmatic framework 
under which the simulation researcher and theorist can examine the fundamental 
assumptions of a given model, regardless of the paradigm under which it was 
constructed. 

Our expansion of Levins’ framework provides closure to this list of theoretical 
concerns, showing how the issue of tractability confines computational modelling. 
While simulations are a remarkably attractive methodology for their relative 
simplicity and apparent explanatory power, these characteristics cannot overcome 
the simple problems of complexity and analysability which can make an otherwise 
appealing model incomprehensible. 

Even assuming tractability and the four dimensions of Levins’ updated frame- 
work are examined in detail during the construction of a simulation, Marr, Peacocke 
and Clark point out the difficulties inherent in deriving useful insight from such a 
model. The usual procedure for developing an understanding of a natural system’s 
behaviour is circumvented by the simulation process; the simulation produces 
results directly, without the intermediate step of producing an algorithmic under- 
standing of the behaviour of interest, as is the normal case in empirical research. 
This leaves the simulation researcher in a very difficult position as he attempts 
to work backward from his completed simulation results to find that algorithmic 
understanding.

From this chain of investigations the overall appeal of the simulation methodol-
ogy and the production of these ‘artificial worlds’ becomes apparent. By creating an 
artificial world in which simulated agents interact and produce complex behaviour 
of their own accord, the researcher can evade the questions raised by Levins and 
plausibly escape the constraints pointed out by Marr, Peacocke and Clark. The 
artificial world can easily become a target of experimentation, providing as it does 
a distilled and simplified version of its real-world inspiration, without some of the 
attendant analytical difficulties. 

Of course, this methodology produces analytical difficulties of its own. This
artificial world cannot replace the real world as a source of empirical data, as its
complexity is far lower and its construction far more theory-dependent. As much
as the simplicity and power of computer simulation exert a strong lure for the
research community, the objections of Levins and others provide a powerful case
for a great deal of caution in the deployment and use of such methods.

As looking through the lens of population biology provided a useful set of 
criticisms and possible constraints for simulation modelers, so looking through the 
lens of the social sciences will provide a look at some promising areas for simulation 
research. We have already seen how simulation can be very appealing for fields 
which suffer from a dearth of empirical data, and the social sciences are in the unique 
position of providing a great variety of theoretical background in a field which is 
often very difficult to examine empirically. An examination of the use of simulation 
in the social sciences will provide a means to integrate these varying perspectives 
on simulation into a framework that accounts for the strengths and weaknesses we 
have exposed thus far.


Part II
Modelling Social Systems


The advent of Alife led to a period of great excitement, as the power and flexibility
of simulation suggested whole new ways to study the processes of biology. So
too with social simulation, a relatively recent development in the social sciences
in which simulation methods, most commonly agent-based models, are used
to model social systems and their behaviours. The prospect of using simulation to
model the emergence of population-level effects from individual social interactions
at the agent level has generated similar excitement.

In Part II we will investigate social simulation as a methodology, uncovering both 
the strengths and weaknesses of this approach for revealing the processes underlying 
human society and its evolution. We will examine the complex relationships 
between social simulations, real-world population data, and social theory. We will 
also expand the methodological analysis begun in Part I and discover how the 
modelling frameworks we studied may be applied to social simulation, and how 
they compare to similar frameworks developed by social modellers and theorists 
themselves. 

In order to examine these frameworks and their potential utility for social 
simulation, we will take inspiration from the early days of the field. The beginning 
of social simulation is frequently credited to Thomas Schelling and his residential 
segregation model, an elegantly simple investigation of how even minor preferences 
amongst individuals for living near to others similar to themselves can lead to 
residential segregation emerging at the population level. We will use Schelling’s 
model to compare the methodological frameworks uncovered through Parts I and II,
and use the insights gained to propose a way forward for social simulation as a 
whole. 
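
To make the mechanism concrete, a Schelling-style model can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Schelling’s original specification: the wrap-around grid, the eight-cell neighbourhood, the 30% preference threshold, the rule that dissatisfied agents move to a randomly chosen empty cell, and all function names and parameter values are choices made for this example.

```python
import random


def neighbours(grid, r, c):
    """Occupants of the eight surrounding cells (the grid wraps at its edges)."""
    n = len(grid)
    cells = [grid[(r + dr) % n][(c + dc) % n]
             for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    return [v for v in cells if v is not None]


def like_fraction(grid, r, c):
    """Share of occupied neighbouring cells holding the same type as (r, c)."""
    occupied = neighbours(grid, r, c)
    if not occupied:
        return 1.0  # an agent with no neighbours is treated as content
    return sum(v == grid[r][c] for v in occupied) / len(occupied)


def mean_similarity(grid):
    """Average like-neighbour share over all agents: a crude segregation index."""
    n = len(grid)
    vals = [like_fraction(grid, r, c)
            for r in range(n) for c in range(n) if grid[r][c] is not None]
    return sum(vals) / len(vals)


def run(size=20, empty=0.1, threshold=0.3, sweeps=50, seed=1):
    """Return the segregation index before and after agents relocate."""
    rng = random.Random(seed)
    n_cells = size * size
    n_empty = int(n_cells * empty)
    n_a = (n_cells - n_empty) // 2
    pool = ["A"] * n_a + ["B"] * (n_cells - n_empty - n_a) + [None] * n_empty
    rng.shuffle(pool)
    grid = [pool[i * size:(i + 1) * size] for i in range(size)]
    before = mean_similarity(grid)
    for _ in range(sweeps):
        # Agents whose like-neighbour share is below the threshold are unhappy.
        unhappy = [(r, c) for r in range(size) for c in range(size)
                   if grid[r][c] is not None
                   and like_fraction(grid, r, c) < threshold]
        if not unhappy:
            break
        rng.shuffle(unhappy)
        for r, c in unhappy:
            # Each unhappy agent moves to a randomly chosen empty cell.
            empties = [(i, j) for i in range(size) for j in range(size)
                       if grid[i][j] is None]
            i, j = rng.choice(empties)
            grid[i][j], grid[r][c] = grid[r][c], None
    return before, mean_similarity(grid)


if __name__ == "__main__":
    before, after = run()
    print(f"mean like-neighbour share: {before:.2f} -> {after:.2f}")
```

Each agent here tolerates being in a local minority, yet the population-level index typically rises well above the threshold any individual requires. Varying threshold and empty is one way to probe how sensitive that outcome is to the assumed preferences.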


Chapter 5 
Modelling for the Social Sciences 


Eric Silverman and John Bryden 


5.1 Overview 


The examination of the use of ‘artificial worlds’ in the previous chapter seemed 
to produce some rather damning concerns for those involved in agent-based 
simulation. While such models can provide a nicely compartmentalised and distilled 
view of a vastly more complicated real-world system, such a model can create 
tremendous difficulty when the researcher must begin to analyse the resultant data. 

In the context of our overall discussion, however, the examples and analysis 
presented so far have focused largely on models based upon or inspired by biology. 
Agent-based modelling is far from confined to this singular area of study, so this 
chapter will introduce a major theme of this text: the uses and limitations of agent- 
based models in the social sciences. 

As agent-based modelling spreads through various disciplines, there are certain 
applications which seem particularly promising due to an attendant lack of empirical 
data underwriting those disciplines. Social science seems an especially relevant 
example of this situation; by its very nature, the study of social systems makes 
data-gathering a difficult proposition. 

In such an instance, can agent-based models provide a means for developing 
social theories despite the lack of empirical data to validate the models themselves? 
This chapter will examine this question in detail, first by describing the current
state of the art in simulation in the field, and second by examining the
methodological and philosophical implications of applying simulation techniques to the social
sciences. 

This chapter forms the first section of Part II of our analysis. The discussion in 
this chapter places the current state of social science simulation into the context of 
our analysis of modelling for Alife and the biological sciences. This allows us to 
develop a contrast between simulation approaches in these two fields, and in
turn discover those points at which the unique problems of social science research 
impact upon the relevance of agent-based models to the social science researcher. 


5.2 Agent-Based Models in Political Science 


5.2.1 Simulation in Social Science: The Role of Models 


In the past, models within social sciences such as economics, archaeology and 
political science have focused on mathematical approaches, though as modelling 
techniques within the field progressed, some began to criticise this focus. Read 
(1990) observes that mathematical models within archaeology may have produced 
sets of modelling assumptions that do not produce useful insights into human 
behaviour. He posits that the transition of mathematical models to social science 
from traditional uses in physics and other natural sciences had actually restricted 
the progression of archaeology by focusing on these inappropriate assumptions. 

Similarly, the traditionally statistically-focused discipline of demography has 
begun to embrace new methodologies as the limitations of these methods for certain 
types of research questions have become evident (Billari and Prskawetz 2003). The
advent of microsimulation in the demographic community has led to some in-depth 
examination of the foundations of demographic knowledge (e.g., Courgeau 2007), 
with some suggesting that agent-based modelling could form the foundations of a 
new, systems-based demography (Courgeau et al. 2017). The development of these 
modelling methods has been met with significant enthusiasm, though the marriage 
between data-focused demographic statistical modelling and abstract, individual- 
based modelling is an uneasy one. 

Looking at social sciences more broadly, McKelvey (2001) notes that an 
increasing number of social scientists and economists propose that the traditional 
perspective of dynamics in these areas, focused upon changing equilibria, is
outdated, and that the agent-based perspective of interacting autonomous social actors
will provide greater insight into social behaviour. Indeed, Billari and Prskawetz 
(2003) argue that demography may only be able to surpass some of the challenges 
facing the study of population dynamics by embracing agent-based models. Hen- 
rickson and McKelvey (2002) even propose that an increased focus on agent-based 
methodologies could give social science greater legitimacy within the scientific 
community, allowing for a greater degree of experimentation and analysis across 
social science as a whole.


5.2.2 Axelrod’s Complexity of Cooperation


Axelrod’s 1997 book “The Complexity of Cooperation” led the charge for this 
increasing number of social scientists looking towards agent-based models as 
a method for examining sociological structures. In the same year, Cederman’s 
“Emergent Actors in World Politics” provided an intriguing look at potential 
applications of such models to the political sciences. In the years that followed, 
the early mathematical models of political and social problems began to transfer 
to agent-based modelling methodologies: Epstein et al. (2001) used agent-based 
models to examine civil violence; Lustick (2002) used similar techniques to study 
theories of political identity in populations; Kollman et al. (1997) modelled the
movements and shifting alliances of voters; and Schreiber (Donald et al. 1999) 
modelled the emergence of political parties within a population, to provide just a 
few examples. Thus, an increasing number of political scientists seem interested 
in modelling the emergence of social structures and institutions from the level of 
individuals or small sub-populations; however, the community remains divided as 
to the usefulness of such methodologies. 


5.3 Lars-Erik Cederman and Political Actors as Agents 


5.3.1 Emergent Actors in World Politics: A Modelling Manifesto


Cederman’s initial work in his 1997 book-length treatise focused on the simulation 
of inter-state interactions. Each model represented nation-states as agents, each 
with differing drives that altered their pattern of interaction with other states in 
the region. He presents these simulations as a means for building an aggregate- 
level understanding of the function of political structures by understanding the 
interactions of micro-level features which produce those effects (although, notably, 
he did not begin modelling these interactions using smaller units until a few years 
later, after the publication of Emergent Actors). 

The excitement apparent in the pages of Emergent Actors seemed infectious, 
bringing other social and political scientists into the fold with an increasing number 
of agent-based models finding publication in political science journals. Lustick 
(2000) proposed that agent-based models could alleviate some of the shortcomings 
of previous modelling approaches: 


Difficulties of amassing and manipulating collective identity data into theoretically potent 
comparisons are among the reasons that agent-based modelling can play an important role 
in the elaboration, refinement, and testing of the kind of specific and logically-connected 
theoretical claims that constructivists have been faulted for not producing. Because the 
models run on computers there is no room for ambiguity in the specification of the model’s 
underlying rules.


Kenneth Benoit went so far as to argue that ‘because simulations give the
researchers ultimate control, simulations may be far better than experiments in 
addition to being cheaper, faster, and easier to replicate’ (Benoit 2001). In a 
discipline where solid field-collected data is both complex to analyse and expensive 
to collect (or even impossible depending on the related political conditions), the 
prospect of being able to generate useful data from low-level simulated interactions 
seemed quite promising. 


5.3.2 Criticism from the Political Science Community 


Criticism of agent-based models in political science has come from a number of 
different sources, but a large portion of those criticisms focus on the difficulty of 
making sensible abstractions for social and political structures within such a model. 
As described earlier in relation to Alife models, one can potentially view all of 
scientific inquiry as reflective of the inherent biases of the experimenter, and this 
problem of theory-dependence is even more acute in models requiring the level of 
abstraction that political models necessitate. 

Of course, even the most abstract of Alife models may reference both the 
real-life behaviour of natural biological systems and the wealth of knowledge 
obtained from many decades of observation and experimentation in evolutionary 
biology. Political science, however, does not have that luxury. Highly complex social
structures and situations, such as Cederman’s models of nationalist insurgency
(Cederman 2002, 2008), involve further layers of abstraction, including factors
which do not immediately lend themselves to quantification such as cultural and
national identities.

Evolutionary Alife simulations also benefit from an extensive backdrop of both 
theoretical and empirical work on innumerable species, allowing for the basic 
functions of evolutionary dynamics within biological systems to be modelled fairly
competently. In contrast, political systems involve multiple layers of interacting 
components, each of which is understood primarily as an abstracted entity; fre- 
quently only the end results of political change or transition are easily observable, 
and even then the observer will have great difficulty pinpointing specific low-level 
effects or drives which may have influenced those results. 

As Klüver et al. (2003) describe, sociological theory may not benefit from the
micro/macro distinction of levels of analysis that benefits researchers of evolution
and other large-scale processes. A micro/macro distinction allows the simulation
researcher to create a hierarchical relation between elements, making for simpler
analysis. The interacting social levels present in a political system, however, cannot
be so clearly differentiated into a hierarchy of processes, making simulation a
difficult, and highly theory-dependent, exercise. For these reasons, theory-dependence
in social simulation becomes a more acute problem than in Alife;
Sects. 5.6 and 5.7 examine this difficulty in detail, and propose some possible
solutions for the social simulation community based upon the systems sociology
approach of Luhmann (1995).


5.3.3 Areas of Contention: The Lack of ‘Real’ Data


Donald Sylvan’s review of Cederman’s Emergent Actors in World Politics highlights
another common complaint levelled at agent-based models by conventional
political science:


Moreover, ‘data,’ in the way this term is understood by most statistically-based modelling
procedures, is largely absent from Emergent Actors. This feature is very much in line with
the rational-choice modelling tradition, of which the author is so critical. Many readers
will find nothing problematic about this feature in a theoretical work such as this. However,
it is important that readers understand that the lack of ‘data’ is a standard feature of CAS
simulation as they evaluate the ‘results’ reported. (Sylvan 1998, p. 378)


As Sylvan points out, Cederman’s ‘data’ only relates to the interactions of virtual 
states in an idealised grid-world; applying such data to real-life political events or 
transitions seems suspect. The level of complexity at work in large-scale political
events may be very difficult to capture in an agent-based model, and it is hard to
know when a specific conclusion can safely be drawn from a model of a situation
that is inherently so difficult to analyse.


5.4 Cederman’s Model Types: Examples and Analysis 


Despite these objections, Cederman, as evidenced by his extensive book-length 
work on agent-based modelling and subsequent methodological papers, sees agent- 
based modelling as a promising means of investigation for political scientists 
(Cederman 1997, 2001). His attempt to present a framework to describe the various 
potential goals of models in this discipline provides an opportunity to contrast this 
proposed social simulation approach with the other modelling frameworks analysed 
thus far. 


5.4.1 Type 1: Behavioural Aspects of Social Systems 


Cederman, in describing his three-part categorisation of social simulation (see 
Table 5.1), credits Axelrod’s early work on the iterated prisoner’s dilemma as the 
first major foray into modelling behavioural aspects of social systems (Cederman 
2001; Axelrod and Hamilton 1981; Axelrod 1984). Axelrod’s work aimed to show 
the emergence of cooperation, and with the iterated prisoner’s dilemma showed 
that cooperation is possible in social settings as long as the interactions of involved 
agents are iterated. 
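
The role of iteration can be illustrated with a minimal sketch of the game. The payoff values used here (3, 0, 5 and 1) are the conventional textbook ones and are an assumption, as are the function names; this is not a reconstruction of Axelrod’s tournament code.

```python
# Illustrative iterated prisoner's dilemma. The payoff values are the
# conventional textbook ones (an assumption, not taken from this chapter).
PAYOFF = {  # (my move, their move) -> my score; "C" cooperate, "D" defect
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def tit_for_tat(own_history, other_history):
    """Cooperate first, then copy the opponent's previous move."""
    return other_history[-1] if other_history else "C"

def always_defect(own_history, other_history):
    """Defect unconditionally."""
    return "D"

def play(strategy_a, strategy_b, rounds):
    """Return the total scores (a, b) after the given number of rounds."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b
```

In a single round, defecting against tit-for-tat earns 5 against 0. Over ten rounds, two tit-for-tat players earn 30 each while two unconditional defectors earn only 10 each, which is the sense in which repeated interaction makes cooperation viable.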

This variety of work has continued in the years since, beginning with modifi- 
cations to Axelrod’s original model, such as spatially-embedded versions which 
show the emergence of cooperative clusters in a social system (Lomborg 1996). 
By incorporating more complex elements into the original models, researchers 
have attempted to draw further conclusions about the evolution of cooperative
behaviours; such a focus on these aspects of a social system is characteristic of a
Type 1 model (hereafter referred to as C1) under Cederman’s classification.

Table 5.1 Summary of Cederman’s three modelling types

Cederman’s model classification
C1 Focus on behavioural elements of social systems
C2 Focus on emergence of agent configurations
C3 Focus on emergence of interaction networks between agents

A major benefit of this type of model is computational simplicity. The prisoner’s 
dilemma example noted here is a well-known problem in game theory, and has 
been implemented and studied countless times over the past few decades. For the 
modeller, reproducing such a game computationally is a relatively simple task 
compared to more complex models, owing to the small number of parameters and
the inherently compartmentalised nature of the interactions between
players of the game.

To use our bird migration example, imagine that a certain bird species has 
demonstrated a social behaviour which can produce greater nest productivity 
between cooperating females, but at the expense of more frequent reproduction. 
In this case a C1 model might be useful, as a game-theoretic model could be 
designed to probe the ramifications of this behaviour in different situations. While 
our researcher would be pleased in one sense, given the greater simplicity of model- 
construction in this case, the model would also be quite narrow in its focus, and the 
abstractions made in such a model would be significant. 


5.4.2 Type 2: Emerging Configurations 


Cederman identifies Type 2 models (C2) as those which attempt to explain the 
emergence of particular configurations in a model due to properties of the agents 
(or ‘actors,’ to use Cederman’s terminology) involved (Cederman 2001). Models of
cultural evolution fit this description, as they rely upon the interaction and exchange
of agent properties (often identified as ‘arguments’ or ‘attitudes’) and examine the
resultant configurations of agents (March 1991; Axelrod 2001).

Ian Lustick’s Agent-Based Identity Repertoire Model (ABIR) is a suitable exam- 
ple of a modern C2 model, as it provides agents with potential ‘identities’ relating 
to different groups of agents which can be modified through interactions between 
those agents (Lustick 2006). C2 models such as ABIR focus on demonstrating the 
emergence of larger configurations within the social systems they simulate; in this 
case, the properties of each agent in the ABIR model have been used to study 
the development of clusters of ethnic and religious groups under various social 
situations. 
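
The general shape of a C2 model can be sketched as follows. This is not the ABIR rule set, which is not specified here; it is a simplified trait-copying scheme in the style of cultural-dissemination models, placed on a ring of agents. All names, parameters and the similarity rule are illustrative assumptions.

```python
import random

def make_agents(n, features, traits, rng):
    """Each agent is a list of `features` integers drawn from range(traits)."""
    return [[rng.randrange(traits) for _ in range(features)] for _ in range(n)]

def step(agents, rng):
    """One interaction on a ring: an agent may copy a trait from a neighbour."""
    i = rng.randrange(len(agents))
    j = (i + rng.choice((-1, 1))) % len(agents)
    a, b = agents[i], agents[j]
    differing = [k for k in range(len(a)) if a[k] != b[k]]
    similarity = 1 - len(differing) / len(a)
    # Assumed rule: interaction is more likely between similar agents and
    # cannot happen between agents that share no traits at all.
    if differing and rng.random() < similarity:
        k = rng.choice(differing)
        a[k] = b[k]

def count_configurations(agents):
    """Number of distinct trait profiles present in the population."""
    return len({tuple(agent) for agent in agents})

def run(n=30, features=3, traits=4, steps=20000, seed=0):
    """Return the number of distinct profiles before and after the run."""
    rng = random.Random(seed)
    agents = make_agents(n, features, traits, rng)
    before = count_configurations(agents)
    for _ in range(steps):
        step(agents, rng)
    return before, count_configurations(agents)
```

The quantity of interest in such a sketch is the configuration of agents, for instance the number and extent of regions sharing a profile, rather than the payoff of any individual behaviour.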

These C2 models offer greater complexity for the modeller than C1 models, but 
they also offer the possibility of examining another category of questions about
social systems. To use our bird migration example, imagine that our researcher
wished to examine the interactions of members of a flock upon arrival at a 
destination and reaching a suitable colony site. A C2 model may be a useful 
approach in this context, as our researcher could construct a model which assigns 
each agent certain properties (such as gender, behaviour mode, and so on) which 
could then allow the agents to interact using these properties and delegate roles and 
responsibilities in establishing a colony. 

Of course, this model is far more complex than the C1 example above, but the 
problem in question is also very different. Not all problems are easily broken down 
into variations or extensions of highly-studied game-theoretic situations, so in this 
case our bird researcher may prefer to construct a novel model in the C2 style which 
suits this problem, despite the greater difficulties inherent in doing so. 


5.4.3 Type 3: Interaction Networks 


Cederman classifies Type 3 models (C3) as perhaps being the most ambitious: 
this type of system attempts to model both the individual agents themselves and 
their interaction networks as emergent features of the simulation (Cederman 2001). 
Interestingly, Cederman cites the field of artificial life as one likely to inform this 
area of computational work in political science, given that Alife focuses on such 
emergent features. He also acknowledges that some overlap can occur between C1
and C3 models, for example when agents are allowed more latitude in choosing
interaction partners (Cederman 2001; Axelrod 2000).

Cederman argues that C3 models may provide very powerful tools for the 
political scientist, allowing for profound conclusions to be drawn regarding the 
development of political institutions. This approach does seem the most method- 
ologically difficult of the three types in this classification, however, as the already 
significant abstractions necessary to create C1 and C2 models must be relaxed even 
further to allow for such ambitious examinations of emergent features at multiple 
levels. 
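
One ingredient of a C3 model, an interaction network that is produced by the simulation rather than fixed in advance, can be sketched in a few lines. The reinforcement rule, the fixed behavioural types and the payoff values below are assumptions made for illustration; a full C3 model would also let the agents themselves emerge, which this sketch does not attempt.

```python
import random

def simulate(types, rounds, seed=0):
    """types: list of "C"/"D" fixed behaviours (a simplifying assumption).
    Returns weights[i][j], the propensity of agent i to seek out agent j."""
    rng = random.Random(seed)
    n = len(types)
    # Start with equal propensities toward every other agent.
    weights = [[0.0 if i == j else 1.0 for j in range(n)] for i in range(n)]
    payoff = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
    for _ in range(rounds):
        for i in range(n):
            # Choose a partner with probability proportional to the tie weight.
            j = rng.choices(range(n), weights=weights[i])[0]
            # Reinforce the tie by what agent i earned from the encounter.
            weights[i][j] += payoff[(types[i], types[j])]
    return weights
```

Under these assumptions a cooperator gains nothing from meeting a defector, so its ties to defectors are never reinforced, while its ties to other cooperators grow; the resulting weight matrix is the emergent network.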

To once again use our bird migration example, we could imagine any number of 
possible Alife-type models which would fall under the C3 categorisation. Our bird
researcher would encounter the same difficulties with these models that we have 
described in previous chapters. The C3 classification as provided by Cederman is 
quite broad indeed — presumably due to the relatively recent appearance of this type 
of model within political science. 


5.4.4 Overlap in Cederman’s Categories 


As mentioned above, Cederman acknowledges that there is some overlap between 
his C1 and C3 categories (Cederman 2001). However, given the complex nature
of social interaction, none of his categories provides a hard distinction that makes
categorisation of simulations obvious in every case. 

Cederman points to the possibility of a C1 model straying into C3 territory by
simply allowing its agents greater freedom in choosing interaction partners. In a
sense, C3 seems to be a superset of C1 and C2; a C3 model could provide insight into
similar issues to those examined by a C1 or C2 model. Given that the C3 approach
is naturally broader, the border separating either of the other types from C3
becomes fuzzier.

Thus, the utility of Cederman’s categories is slightly different from the pragmatic 
nature of Levins’ modelling dimensions (Levins 1966). Defining the position of 
a model along Levins’ dimensions is difficult due to the problems inherent in 
specifying the exact meaning of generality, realism and precision, but the framework 
as a whole remains useful as a pragmatic guideline for modellers (see Chap. 4 
for further discussion). Cederman’s framework is not intended to serve this same 
purpose, but does provide a means to classify and discuss models in terms of social 
science research questions. For this reason, Cederman’s framework will be useful 
to us as we investigate modelling in the social sciences in greater detail in the 
remainder of the text. 


5.5 Methodological Peculiarities of the Political Sciences 


5.5.1 A Lack of Data: Relating Results to the Real World 


As Sylvan’s review of Cederman emphasizes, Cederman’s ‘data’ only relates to the 
interactions of virtual states in an idealised grid-world; applying such data to real- 
life political events or transitions seems suspect at best. The levels of complexity 
at work in large-scale political events may be very difficult to capture in an agent- 
based model, and knowing when to draw a specific conclusion from a model of such 
an inherently difficult-to-analyse situation is quite difficult. 

In essence the lack of real ‘data’ produced by such simulations is an issue critical 
to the acceptance of such models in mainstream political science. While some 
accept the potential for social simulations to illuminate the emergence of certain 
properties of political structures (Epstein et al. 2001; Axelrod 2001), the difficulty in 
connecting these abstracted simulations to real-world political systems is significant. 
Weidmann and Girardin, with their GROWLab simulation toolkit, have attempted
to sidestep these concerns by making their framework compatible with GIS 
(geographic information system) data in order to allow ‘calibration with empirical 
facts to reach an appropriate level of realism’ (Weidmann and Girardin 2006). 
They also emphasize the relational and spatially-embedded aspects of GROWLab 
simulations, presumably a nod to the importance of spatial considerations and social 
interactions in a real-world political context.


5.5.2 A Lack of Hierarchy: Interdependence of Levels of Analysis


While even abstract Alife models may reference the real-life behaviour of natural 
biological systems, and the wealth of related empirical data, political models do 
not necessarily have that luxury. Highly complex social structures and situations, 
such as Cederman’s models of nationalist insurgency and civil war (Cederman
and Girardin 2005; Cederman 2008), involve further layers of abstraction, often
involving factors which do not immediately lend themselves to quantification, such 
as cultural and national identities. 

In addition, sociological theory is notoriously difficult to formalise, incorporating 
as it does a number of both higher- and lower-level cognitive and behavioural 
interactions. In fact, sociological theory may not benefit from the micro/macro 
distinction of levels of analysis that benefits researchers of evolution and other
large-scale processes (Klüver et al. 2003). These interacting social levels cannot be clearly
differentiated into a hierarchy of processes, making simulation a very difficult, and 
highly theory-dependent, exercise. 


5.5.3 A Lack of Clarity: Problematic Theories 


Doran (2000) identifies a number of problems facing social scientists who wish 
to ‘validate’ their simulation work. He maintains that social scientists need not 
provide ‘specific validation,’ a direct connection to a target system, but instead face 
a more nebulous difficulty in demonstrating relevance of the assumptions within that
simulation to social systems at large. He notes the immediate difficulties of finding 
an appropriate parameter space and method for searching that space, of analysing 
the simulation results in a way that does not produce an ‘intractable level of detail,’ 
and the problem of instability in simulations and the necessity of detailed sensitivity 
analyses. In his conclusion he argues convincingly for sensible constraints in social 
simulations which do not add confounding cultural biases to the behaviour of agents 
within the simulation. While Doran’s examples provide a useful illustration of this 
concept, simulation architects may find great difficulty in ensuring that such biases 
are absent from their work, particularly in more complex multi-agent simulations. 
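
The parameter-space search and sensitivity analysis that Doran mentions can be sketched as a grid sweep with replicate runs. The `toy_model` below is a hypothetical placeholder standing in for any simulation that maps parameters and a random seed to a single outcome; its parameters and dynamics are assumptions for illustration only.

```python
import itertools
import random
import statistics

def toy_model(adoption_rate, noise, seed, agents=100, steps=50):
    """Placeholder model: share of agents holding an opinion after `steps`."""
    rng = random.Random(seed)
    holders = 1
    for _ in range(steps):
        share = holders / agents
        gained = sum(rng.random() < adoption_rate * share
                     for _ in range(agents - holders))
        lost = sum(rng.random() < noise for _ in range(holders))
        holders = max(1, min(agents, holders + gained - lost))
    return holders / agents

def sweep(model, grid, seeds):
    """Run `model` for every parameter combination and every seed.
    Returns the parameter names and {values: (mean, std. dev. across seeds)}."""
    names = sorted(grid)
    results = {}
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        outcomes = [model(seed=s, **params) for s in seeds]
        results[values] = (statistics.mean(outcomes),
                           statistics.pstdev(outcomes))
    return names, results
```

A large spread across seeds at a given grid point signals instability, and a sharp change in the mean between neighbouring grid points signals sensitivity to that parameter. Summarising each run as one number is also a crude guard against the ‘intractable level of detail’ noted above.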


5.6 In Search of a Fundamental Theory of Society 


5.6.1 The Need for a Fundamental Theory 


As we have seen, social science presents a few particularly thorny methodological 
problems for the social simulator. Despite this, can social simulation be used to
illuminate the underlying factors which lead to the development and evolution of
human society? Social simulation certainly presents a new approach for examining 
societal structures, and perhaps could serve as a novel method for testing hypotheses 
about the very origins of society itself. 

However, using social simulation for this purpose seems fraught with difficulties. 
Human society is an incredibly complex system, consisting as it does of billions 
of individuals, each making hundreds of individual decisions and participating in 
numerous interactions every day. There are vast numbers of factors at play, and many
of them are inherently unanalysable, given that we cannot examine the contents of
a human’s brain during a decision or interaction; this appears to make the already
monumental task of the social simulator nearly impossible in such a case.

The problem of how life has evolved on Earth would also seem to be one of
insurmountable complexity, were it not for the theories of Charles Darwin
(1859). Perhaps there is hope for a theory in social science of similar explanatory
power to evolution: not necessarily one that fully explains society, but one that
provides us with a holistic framework to push forward our understanding of society,
much as evolution does for biology.


5.6.2 Modelling the Fundamentals 


While Cederman describes a broad framework in which social simulation can 
operate, a fundamental social theory seems difficult to develop under his description 
of C1, C2, and C3 models (Cederman 2001). C3 models are designed to allow for 
the development of broad-stroke models which can illuminate more fundamental 
theories about political systems; however, to extend that to societal structures and 
interactions as a whole requires a new level of abstraction. 

In essence an extension of the C3 categorisation becomes necessary when 
seeking a fundamental theory. In the context of political science, agents would 
be operating under a framework developed from that particular field of social 
science; while political decisions amongst agents will by their nature require 
the incorporation of elements of psychology and sociology, a more fundamental 
approach requires an even more abstract method for allowing all varieties of social 
behaviour to emerge from the simulated system. 

As part of this new approach, we also need a new perspective on the development 
of human society. How do individual actors grow to communicate? How does that 
communication then become structured? How do these structured communications 
then grow into a societal-level framework guiding interactions between members of 
a population? To see the fundamentals of human society develop, the model would 
need to set a stage upon which society may grow without a pre-set communicative 
framework already in place.


5.7 Systems Sociology: A New Approach for Social Simulation?


As we have seen, the advent of social simulation has proved influential in the 
social sciences, provoking new questions regarding the origin and nature of society. 
While the examples discussed thus far demonstrate the potential impact of social 
simulation, they also illustrate the inherent difficulties involved in generalising the 
conclusions drawn from a social simulation. More generalised models of society 
may provide a means for investigating aspects of society which elude the empirical 
data-collector and in turn inform our search for a fundamental social theory, but in 
order for this to occur we need to establish a method of examining society on a 
broad theoretical scale through simulation. 


5.7.1 Niklas Luhmann and Social Systems 


The well-known social systems theory of Niklas Luhmann provides one example 
of an attempt to develop an understanding of the foundations for social behaviour.
Luhmann classifies social systems as systems of communication which attempt to 
reduce complexity by presenting only a fraction of the total available information 
(Luhmann 1995). 

One of the fundamental issues facing the systems sociology theorist is solving 
the problem of double contingency, an issue Luhmann describes as central to the 
development of social order. Put simply, if two entities meet, how do they decide 
how to behave without a pre-existing social order to govern their actions? How 
might these entities decide to develop a common means of interaction, and through 
those interactions develop a shared social history? 

As Dittrich, Kron and Banzhaf describe, Luhmann proposed a method for
resolving this contingency problem which was far more elemental than previous
approaches, relying as it does on ‘self-organization processes in the dimension of
time’ rather than on more standard social processes. The entities in question
would perform initial contingency-reducing actions during an encounter to allow for 
each to develop an understanding of the expectations of each party in the interaction 
(Dittrich et al. 2003). 
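
A deliberately crude sketch may help to fix the idea of contingency-reducing actions. It is not the model of Dittrich et al.; the message alphabet, the `explore` parameter and the rule that an agent repeats its most frequent past reply are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict

class Agent:
    """Replies to messages so as to stay predictable to the other party."""
    def __init__(self, n_messages, explore, rng):
        self.n_messages = n_messages
        self.explore = explore                # chance of a random reply
        self.rng = rng
        self.history = defaultdict(Counter)   # incoming -> own past replies

    def reply(self, incoming):
        past = self.history[incoming]
        if past and self.rng.random() >= self.explore:
            # Reduce contingency: repeat the usual reply to this message.
            message = past.most_common(1)[0][0]
        else:
            message = self.rng.randrange(self.n_messages)
        past[message] += 1
        return message

def encounter(steps, n_messages=5, explore=0.1, seed=0):
    """Two agents exchange messages; returns a log of (who, incoming, reply)."""
    rng = random.Random(seed)
    agents = [Agent(n_messages, explore, rng) for _ in range(2)]
    message, log = rng.randrange(n_messages), []
    for t in range(steps):
        reply = agents[t % 2].reply(message)
        log.append((t % 2, message, reply))
        message = reply
    return log

def predictability(log):
    """Share of replies matching the sender's most frequent reply."""
    table = defaultdict(Counter)
    for who, incoming, reply in log:
        table[(who, incoming)][reply] += 1
    hits = sum(c.most_common(1)[0][1] for c in table.values())
    return hits / len(log)
```

With `explore` set to zero each agent’s first reply to a message is arbitrary and every later reply repeats it, so the exchange becomes fully predictable without any shared convention having been supplied in advance.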

In Luhmann’s view, the social order develops as a consequence of these
contingency-reducing actions on a large scale. As elements of the developing
society form their expectations about the social expectations of others (described
as ‘expectation-expectation’ by Luhmann), a system of social interaction develops
around this mutual social history. This system then produces as a consequence the
social institutions which can further influence the development of the social order.
These social institutions perform a similar function by reducing the amount of
information disseminated amongst the members of a society, essentially providing
contingency-reducing services on a much larger scale.

Agent-based models in the context of artificial life have certainly proved useful in
the examination of other autopoietic systems; however, recent attempts to formalize
Luhmann’s theories into a usable model, while producing interesting results, have
highlighted the inherent difficulties of encapsulating the many disparate elements of
Luhmann’s theories of social systems into a single model (Fleischmann 2005).


5.7.2 Systems Sociology vs. Social Simulation 


As we can see from Luhmann’s analysis, while there may indeed be a lack of ‘data’
inherent to the study of artificial societies, there still exists a theoretical framework 
for understanding the fundamental mechanisms which drive the creation of a larger 
social order. While some social simulation researchers may seek to strengthen their 
models through establishing direct connections with empirically-collected data from 
social science, the systems sociology perspective could provide a different path to 
more useful examinations of human society. 

The social simulation stream is oriented towards specific elements of social
behaviour: simulations of cooperation (Axelrod 1997), nationalist insurgency
(Cederman 1997), or the spatial patterning of individuals or opinions within a society
(Lustick 2006). Social simulation’s stronger links with empirical data may make
validation of such models much easier, but they further restrict the domain of those
models to social problems for which usable data exists. Given the difficulties
inherent in collecting social science data, these problems tend to be a subset of those
social problems for which models could prove potentially illuminating.

This very restriction into particular domains prevents the social simulation 
approach from reaching a more general perspective; this approach is constrained
by approaching social phenomena from the top down. These top-down approaches
are necessarily rooted in the societies they model. In essence, looking for a feature 
in society and then attempting to reproduce it in a model is not sufficient to develop 
a fundamental theory. 

In contrast, the systems sociology stream abstracts outside of the standard view 
of society. Luhmann’s perspective aims to describe interactions which can lead 
to the development of social order, in a sense examining the development of 
human society through an ‘outside perspective.’ Luhmann essentially moves beyond 
standard sociology, attempting to describe what occurs prior to the existence of 
social order, rather than operating within those bounds as with social simulation. 

Returning for a moment to our bird migration example, imagine that our 
migration researcher wishes to construct a model to investigate the beginnings of 
migration behaviour. One obvious approach may be to model individual agents, 
each of which is given the choice of moving to follow changing resources or 
environmental conditions. However, in a Luhmannian context, we could remove 
that element of pre-existing ideas concerning the origins of migration. A model 
which features individual agents which can move of their own accord, and have
basic requirements for survival in the simulated environment, may provide differing
explanations of migration behaviour if that behaviour is seen to emerge from this 
very basic scenario. In this way, we allow for the possibility of other means for 
migration to emerge: perhaps through a developing social structure which drives 
movement of groups of birds, for example. In essence we would seek to move the 
migration model to a stage in which we assume as little as possible about the origins 
of migration behaviour, rather than assuming that certain factors will produce that 
effect. 

Similarly, by viewing society from its earliest beginnings prior to the existence 
of any societally-defined modes of interaction and communication, the systems 
sociology approach hopes to develop a theoretical understanding of the fundamental 
behavioural characteristics which lead to the formation of social order. In many 
ways this approach is reminiscent of the Alife approach to modelling ‘life-as- 
it-could-be’ (Langton et al. 1989); the systems sociology perspective leads us to 
examine society-as-it-could-be. 


5.8 Promises and Pitfalls of the Systems Sociology Approach 


5.8.1 Digital Societies? 


Having established this relatively promising outlook on the future prospects of 
social-science simulation using Luhmann’s approach, a certain resemblance to the 
early philosophy of artificial life becomes apparent. As in Alife, we may have simply 
replaced one troubling set of methodological and philosophical concerns with 
another. Strong Alife’s contention that computer simulations can be repositories 
for real, digital life provides an escape route for theorists to develop a suitable 
theoretical backstory for Alife. As discussed in Chap. 3, such a backstory can
underwrite these computer simulations as a new method for gathering empirical
data, a means for examining processes like evolution in a way that would otherwise
be completely impractical. As long as we maintain that a computer simulation can
potentially produce life, then our experiments on that digital biosphere can proceed 
apace. However, such a backstory for this ‘artificial society’ approach to social 
science seems a great deal more tenuous. Potentially, we could harken back to 
Silverman and Bullock’s Physical Symbol System Hypothesis for Alife (Silverman 
and Bullock 2004): 


1. An information ecology provides the necessary and sufficient conditions for life. 
2. A suitably-programmed computer is an example of an information ecology. 


Then, if we further argue that society is a property of living beings, we may 
contend that such an information ecology would also provide the necessary and 
sufficient conditions for the development of a society.


5.8.2 Rejecting the PSS Hypothesis for Society


Ignoring for a moment the philosophically and ethically troubling nature of the 
potential theoretical backstory outlined above, those who might find such an 
account appealing will be forced once again to face the artificial-world problem. 
Additionally, the vastly increased complexity of a population of organisms capable 
of developing a society would also increase the troubling aspects of this artificial- 
world approach, creating ever more complex artificial societies that are increasingly 
removed from real-world societies. 

As discussed in Chap. 4, the greatest difficulty with developing an artificial world 
in which to study such complex systems is the problem of connecting that artificial 
world to the natural one on which it is based. The Strong Alife community may 
argue that probing the boundaries of what constitutes life in a virtual world is 
inherently a valuable pursuit, allowing for the creation of a new field of digital 
biology. 

For the social scientist, however, the possibility of creating a further field of 
‘digital sociology’ is less than appealing. In a field where empirical data in relation 
to the natural world is far more lacking than in biology, and in which simulation 
seems to be viewed as a means for enhancing the ability of social scientists to 
produce and test sensible theories, then producing and testing those theories in 
relation to a virtual society without direct connection to real society is quite a 
wasteful pursuit. Indeed, Burch (2002) contends that computer simulations in social 
science will revolutionise the field by embracing this theoretical complexity and 
tying it directly to empirically-relevant questions. 

With the appeal of Luhmann’s approach deriving from the potential for examin- 
ing the earliest roots of societal development, and from that developing a fundamen- 
tal theory of society analogous to evolution in biology, a theoretical backstory along 
the lines of strong Alife seems inappropriate. Instead, the Luhmann-influenced 
social simulator would strive for a theoretical framework which emphasizes the 
potential role for simulation as a means for social explanation and theory-building, 
rather than allowing for the creation of digital forms of society. 


5.9 Social Explanation and Social Simulation 


The problem of explanation in social science, as in most scientific endeavours, is 
a difficult one. In a field with such various methods of data-gathering, prediction, 
and theory-construction, developing a method for providing social explanation is no 
mean feat. 

Proponents of social simulation regard the agent-based simulation methodology 
as one potential method for providing an explanation of social phenomena. Before 
we establish the veracity of this opinion, however, we must establish the ground 
rules for our desired form of social explanation. With social systems involving
potentially many millions of participants, we must determine how we will focus
our social explanations to derive the greatest possible theoretical understanding of 
the processes underlying human society. 


5.9.1 Sawyer’s Analysis of Social Explanation 


As noted by R. Keith Sawyer, ‘causal mechanistic accounts of scientific explanation 
can be epistemically demanding’ (Sawyer 2004). A causal mechanistic explanation 
of a social system would require a detailed analysis of the large-scale elements of the 
system, and their related elements, but also of the individual actions and interactions 
of every member of that society. 

Of course, this explanation may still be insufficient. On a macroscopic scale, 
the behaviour of a human society may be identical despite significant variations 
in the microscopic actions of individual members of that society. In that case, the 
causal mechanist account fails to encompass the larger-scale elements of a social 
explanation which could describe these effects. 

Oddly enough, this description echoes that of many current agent-based models. 
Most of the models discussed thus far in this chapter have displayed a clear 
mechanistic bent; agents interact in ways reminiscent of societal actors and produce 
complex dynamical behaviour, but there is little to no incorporation of larger-scale 
structures such as social institutions. Therefore such models are not only causal 
mechanistic accounts, but they are also methodologically individualist (Sawyer 
2004; Conte et al. 2001). 

With this in mind, Sawyer implies that the current state-of-play in social 
simulation is incapable of providing true social explanation. He states that ‘an 
accurate simulation of a social system that contains multiply-realised macro-social 
properties would have to represent not only individuals in interaction, but also these 
higher-level system properties and entities’ (Sawyer 2003, 2004). 


5.9.2 Non-reductive Individualism 


Revisiting Klüver and Stoica for a moment, we recall their concerns regarding 
agent-based models and the difficulty of capturing the multi-leveled complexity of 
social phenomena within such a structure (Klüver et al. 2003). They argue that social 
phenomena do not adhere to a strictly hierarchical structure in which micro-scale 
properties result in macro-scale behaviour; in fact, these levels are intertwined, and 
separating social systems into that sort of structure as is common in agent-based 
models may be extremely difficult. 

Sawyer’s take on social explanation and social simulation expands on this topic, 
describing the difficulty of applying the concept of emergence (familiar to us from 
artificial life) to the social sciences. While the idea that lower-level simplicity leads 
to higher-level complexity seems intuitively appealing within Alife and other related 
fields, Sawyer presents the idea that in fact social systems do not necessarily obey 
this relation (Sawyer 2003, 2004). 

In this view, there is a fundamental conflict between emergence and the social 
sciences. If social systems can display properties that are ‘irreducibly complex,’ then 
those properties cannot be the result of individual actions or properties, and thus 
could not have emerged from those actions or properties. This is clearly quite a 
dangerous possibility for the social simulator, as then the causal mechanistic method 
of simulating low-level agent interaction to produce high-level complexity would be 
a fundamentally flawed approach. 

In order to escape this potentially troubling theoretical conflict, Sawyer proposes 
his own version of emergence for the social sciences which he dubs non-reductive 
individualism (see Sawyer 2002, 2003, 2004). In this view, Sawyer concedes to 
individualists that their fundamental assumptions about the roles of individuals in 
society are correct (i.e., that all social groups are composed of individuals, and 
those groups cannot exist without the participation of individuals). However, he also 
contends that some social properties are not inherently reducible to the properties 
of individuals; in this case, there is reason to present new ideas and theories which 
treat the properties of social groups or collectives as a separate entity. 

Returning to our bird example, imagine a model which addresses the social 
behaviour of a bird species, perhaps the development of birdsong for use in 
signalling between individuals or something similar. We could choose to model 
these social developments using an agent-based model, which Sawyer would not 
find objectionable; after all, the actions of individuals do drive society in Sawyer’s 
perspective as much as for any other social scientist. We then choose to model the 
development of these birdsong behaviours by allowing these agents to signal one 
another before performing an action, then observing if those signals begin to find 
use amongst the simulated population in different circumstances. 
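
The signalling set-up just described can be sketched in a few lines. The sketch below is only one possible reading: it assumes a Lewis-style signalling game with simple urn-based reinforcement, which is our choice for illustration rather than anything prescribed by Sawyer or by the model under discussion, and the name `run_signalling`, the two-context set-up, and all parameter values are hypothetical.

```python
import random

def run_signalling(n_rounds=5000, n_states=2, n_signals=2, seed=0):
    """Toy signalling game with urn-style reinforcement (an assumption).

    A sender observes a context (state) and emits a signal; a receiver
    hears the signal and picks an action. Both are reinforced when the
    action matches the context.
    """
    rng = random.Random(seed)
    # Urn weights: sender[state][signal] and receiver[signal][action].
    sender = [[1.0] * n_signals for _ in range(n_states)]
    receiver = [[1.0] * n_states for _ in range(n_signals)]
    successes = []
    for _ in range(n_rounds):
        state = rng.randrange(n_states)
        signal = rng.choices(range(n_signals), weights=sender[state])[0]
        action = rng.choices(range(n_states), weights=receiver[signal])[0]
        success = action == state
        if success:
            # Reinforce the signal and the response that worked.
            sender[state][signal] += 1.0
            receiver[signal][action] += 1.0
        successes.append(success)
    return sender, receiver, successes

if __name__ == "__main__":
    sender, receiver, successes = run_signalling()
    early = sum(successes[:500]) / 500
    late = sum(successes[-500:]) / 500
    print(f"success rate, first 500 rounds: {early:.2f}")
    print(f"success rate, last 500 rounds:  {late:.2f}")
```

Comparing the early and late success rates is one crude way of 'observing if those signals begin to find use'; note that everything in this sketch is individual-level, which is precisely the feature Sawyer's argument targets.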

However, Sawyer might argue that our model would be insufficient to ever 
display the richness inherent in bird social behaviour; the development and spread 
of new songs, the separation of songs into differing contexts among different social 
groupings and other such factors may be difficult to capture in a simple agent- 
based model. As with human society, he might argue that the complexity and 
variation of birdsong development requires another layer of simulation beyond 
the simple agent-based interactions underlying the drive to communicate between 
agents. Perhaps vindicating this perspective, some modelling work has shown that 
birdsong grammars and their development can be represented as evolving finite-state 
automata (Sasahara and Ikegami 2004). 
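
To make the idea of a song grammar as a finite-state automaton concrete, a minimal sketch follows. It is not a reconstruction of Sasahara and Ikegami's model; the class `SongAutomaton`, the syllable alphabet, and the single 'add a transition' mutation are all hypothetical choices made for illustration.

```python
import random

class SongAutomaton:
    """A toy finite-state automaton whose transitions emit song syllables."""

    def __init__(self, transitions, start=0):
        # transitions: {state: [(syllable, next_state), ...]}
        self.transitions = transitions
        self.start = start

    def sing(self, rng, max_len=10):
        """Walk the automaton from the start state, emitting syllables."""
        state, song = self.start, []
        # Stop at max_len or on reaching a state with no outgoing arcs.
        while len(song) < max_len and self.transitions.get(state):
            syllable, state = rng.choice(self.transitions[state])
            song.append(syllable)
        return "".join(song)

    def mutated(self, rng, syllables="abc"):
        """Return a copy with one extra transition (a crude grammar mutation)."""
        new = {s: list(arcs) for s, arcs in self.transitions.items()}
        states = sorted(new)
        src = rng.choice(states)
        new[src].append((rng.choice(syllables), rng.choice(states)))
        return SongAutomaton(new, self.start)

if __name__ == "__main__":
    rng = random.Random(1)
    bird = SongAutomaton({0: [("a", 1)], 1: [("b", 0), ("c", 2)], 2: []})
    print(bird.sing(rng))
    print(bird.mutated(rng).sing(rng))
```

Here the grammar itself, rather than any single song, is the unit that varies between individuals, which gives a sense of the additional representational layer the paragraph above alludes to.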

Sawyer names this perspective quite appropriately, drawing as it does upon 
the philosophy of mind perspective of non-reductive materialism. Non-reductive 
materialists accept the primacy of matter in driving the brain, and thus mental 
phenomena, but reject the idea that only low-level discourse regarding this brain 
matter is valid in the context of studying the mind. In other words, non-reductive 
materialists are not dualists, but argue that mental phenomena are worthy of study 
despite the primacy of matter; or, in Sawyer’s words, ‘the science of mind is 
autonomous from the science of neurons’ (Sawyer 2002). 

Thus, in the case of social science, Sawyer essentially argues that the science of 
society is autonomous from the science of individuals. This leads to his contention 
that individualist agent-based models are insufficient to provide social explanation. 
Without incorporating both individual effects and irreducible societal effects, the 
model would not provide the complete picture of societal complexity, as described 
in our example above. 

Interestingly, Sawyer does leave one potential door open for individualist 
modellers. As he admits, one cannot be certain whether a given social property 
can be given an individualist mechanistic explanation, or whether that property will 
be proven irreducible to such explanations (Sawyer 2004); presumably, individual- 
based models could be used to fill that gap. A suitably rich model of interacting 
individuals could possibly provide a testing ground to determine whether a certain 
system does display properties independent of individual properties. However, 
under this view those simulations would only be able to provide explanation in such 
limited cases, and given that not all social systems will display such reducibility to 
individual properties, simulating every possible social construct to find such systems 
is presumably a rather inefficient way to utilise agent-based models in social science. 


5.9.3 Macy and Willer’s View of Explanation 


While Sawyer points out some potentially troubling methodological difficulties for 
social simulators, and also proposes entirely new simulation methods to circumvent 
those difficulties, he does still maintain that social simulation provides a means 
for social explanation when implemented appropriately (Sawyer 2004). Macy and 
Willer also argue that simulation provides a remarkably useful tool for social 
science, and lament the lack of enthusiasm for agent-based models within the social 
science community (Macy and Willer 2002). 

Macy and Willer propose that agent-based models can provide a perspec- 
tive particularly well-suited to sociology, arguing that this methodology ‘bridges 
Schumpeter’s (1909) methodological individualism and Durkheim’s rules of a non- 
reductionist method’ (Macy and Willer 2002, p. 7). Thus, agent-based models can 
produce groups of agents which produce novel and complex higher-level behaviour, 
and in this respect reflect a potentially appealing method of investigation for the 
social scientist seeking to understand the origin of certain societal properties. 

However, Macy and Willer join Sawyer in his caution regarding the application of 
such bottom-up methods to all social phenomena. Stepping away from the excite- 
ment of the artificial life theorists, they admit that these methods are not always 
inherently useful. In fact, they constrain the application of individualist models 
to ‘studying processes that lack central coordination, including the emergence of 
institutions that, once established, impose order from the top down’ (Macy and 
Willer 2002, p. 8). 


5.9.4 Alife and Strong Emergence 


Sawyer and Macy and Willer’s points are more than reminiscent of the debate 
over strong emergence discussed in Chap. 2. In the Alife context, strong emergence 
contends that emergent phenomena can show downward causation, influencing the 
behaviour of their own components. Just as Sawyer discusses, this would result in a 
situation in which that strongly emergent behaviour cannot be reduced to the actions 
of those component parts (O’Conner 1994; Nagel 1961). 

Bedau attempted to get around this restriction by proposing weak emergence, in 
which the macro-components of a system, allowed to run in simulation, can demon- 
strate the general properties of a weakly-emergent phenomenon (Bedau 1997). As 
noted in the previous discussion, however, Bedau’s categorisation requires a certain 
circularity of reasoning; Bedau himself states that only empirical observations at the 
macro-level of a given system can allow us to develop a means to investigate them 
through simulation. 

In essence, Bedau alters the problem slightly by contending that a simulation can 
provide a general understanding of the properties of a system’s macro-behaviour, but 
Sawyer would argue that such an explanation is still incomplete. Bedau’s method, 
after all, does not actually propose a means for the simulation researcher to avoid 
the difficulties caused by downward causation posited by the strong emergence 
theorists. Macy and Willer, despite being more positive about simulation than 
Sawyer, argue that this very difficulty fundamentally limits the utility of simulation 
of this type. Without a means for capturing this downward causative influence 
of macro-level social institutions and similar structures, they contend that more 
traditional empirical methods would remain more useful in some cases. 
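
The difference between a purely bottom-up model and one with a downward causal influence can be illustrated with a deliberately crude sketch. It assumes a binary behaviour, random peer imitation, and a macro-level 'norm' defined as the current majority behaviour; the function `simulate` and the parameter `institution_strength` are hypothetical and do not correspond to any model proposed by Sawyer, Bedau, or Macy and Willer.

```python
import random

def simulate(n_agents=100, steps=50, institution_strength=0.0, seed=0):
    """Agents hold a binary behaviour and imitate one another.

    With institution_strength > 0, a macro-level norm (the current
    majority behaviour) is imposed on each agent with that probability,
    a crude stand-in for downward causation.
    """
    rng = random.Random(seed)
    behaviour = [rng.randint(0, 1) for _ in range(n_agents)]
    history = []
    for _ in range(steps):
        share = sum(behaviour) / n_agents      # macro-level property
        norm = 1 if share >= 0.5 else 0        # the 'institution'
        new = []
        for _ in range(n_agents):
            if rng.random() < institution_strength:
                new.append(norm)               # top-down influence
            else:
                # Bottom-up: copy a randomly chosen member of the population.
                new.append(behaviour[rng.randrange(n_agents)])
        behaviour = new
        history.append(sum(behaviour) / n_agents)
    return history

if __name__ == "__main__":
    print("bottom-up only:  ", simulate(institution_strength=0.0)[-1])
    print("with institution:", simulate(institution_strength=0.3)[-1])
```

Setting `institution_strength` to zero recovers the individualist model; any positive value means that the macro-level state feeds back on the agents that produce it. Whether such an explicitly coded feedback captures what the strong emergence theorists have in mind is, of course, exactly the point at issue.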

Thus, for the simulation researcher who wishes to illuminate some potentially 
influential low-level factors in a given social system, the issues of non-reductive 
individualism or strong emergence do not make an enormous difference. Even if 
such objections are true, the researcher can still produce results which are indicative 
of the importance of those low-level factors in the emergence of the high-level 
behaviours under investigation, particularly with the assistance of an appropriate 
theoretical backstory. However, for the researcher wishing to use simulation as an 
explanation, and thus as a means for generating more powerful social theory, such 
objections create more difficulty. 


5.9.5 Synthesis 


Macy and Willer go on to identify two main streams in social simulation: studying 
the self-organisation of social structure and studying the emergence of social order 
(Macy and Willer 2002). The second stream is quite relevant to our earlier discussion 
of the implications of Luhmann’s theories regarding social order to the agent-based 
modelling methodology. 


As described in our discussion of Luhmann, a central difficulty in social simu- 
lation is the issue of heavily-constrained agent interaction. While agent interactions 
can provide an explanation of social behaviours at the individual level, those 
interactions may be far too limited to provide a useful explanation of larger-scale 
social structures. Sawyer and Macy and Willer provide an affirmation of this idea, 
arguing that most agent-based models are inherently individualist and thus limited 
in their ability to explain many social structures. 

As we describe, the theories of Luhmann provide an intriguing means for 
studying the emergence of social order at the most fundamental level. However, 
while these models might provide insight into the earliest beginnings of certain 
societal interactions and structures, if we believe Sawyer and Macy and Willer then 
we would still be lacking critical elements of a complete social explanation. 


5.10 Summary and Conclusion 


Having taken a tour through the issues of social explanation that bear upon our 
proposed uses of agent-based models in the social sciences, we have a more 
complete picture of the possible empirical niche that such models may fill within this 
field. In particular our look at the explanatory deficiencies inherent to agent-based 
models draws us toward some specific conclusions regarding their most promising 
uses. 

First, as indicated by our analysis of Luhmann’s theories, we see that agent- 
based models suffer from some inherent constraints due to their status as artefacts 
of a society themselves. Given that models constructed based upon our own 
understanding of societal structure to date will naturally have certain fundamental 
assumptions about the operating parameters of a society, using such models to draw 
conclusions about a possible fundamental theory of society is fraught with potential 
difficulties. 

In addition, most agent-based models seen thus far in this analysis have been indi- 
vidualist constructions which seek a mechanistic explanation for societal properties. 
As Sawyer and others have shown, individualist models may lack another essential 
portion of information needed to produce a full social explanation. If we accept the 
non-reductive individualist contention that some social groups or collectives may 
have non-trivial behaviour that cannot be reduced to the actions of individuals, then 
we would be unable to model such social constructions using conventional agent- 
based modelling techniques. 

Thus, our analysis points toward a synthesis of Luhmann-style modelling 
of fundamentals with top-down elements. However, these elements seem 
rather disparate. Is it possible to combine models of the earliest beginnings of social 
interaction together with the influence of established top-down social structures? 
Does not the Luhmann view by its very nature preclude the inclusion of such preset 
structures, filled as they are by tacit assumptions regarding the functioning of society 
and its structures? 
Clearly these two views would not mesh particularly well within a single model, 
but a combination of these two approaches when looking at different aspects of soci- 
ety may contribute to the development of the fundamental theory of human society 
that we seek. The next chapter will examine the current and future state of social 
simulation in relation to the theoretical frameworks elucidated in our analysis thus 
far. These comparisons will give us a more nuanced view of the issues facing agent- 
based modelling, with the addition of the social science perspective providing some 
new considerations. With these elements in mind, and with a view toward the issues 
of social explanation discussed in this chapter, we shall begin a more complete 
synthesis of theoretical frameworks that may drive future work in social simulation. 


Chapter 6 
Analysis: Frameworks and Theories for Social 
Simulation 


6.1 Overview 


The previous chapter focused upon the current methodologies and theoretical 
implications of agent-based modelling in the social sciences. While many within 
this growing field accept that agent-based models provide a potentially powerful 
new method for examining social behaviours and structures, a great debate still 
continues over the best methods for utilising the strengths of this methodology. 

As seen in Part I, such debates are not unique to social simulation. Indeed, 
artificial life and more conventional forms of biological modelling have faced 
similar challenges over the past few decades. With this in mind, this chapter begins 
by placing artificial life within the various theoretical frameworks discussed thus 
far in Chaps. 3, 4 and 5. In this way the limitations of each framework can be 
illuminated. 

Social science simulation using agent-based models shares a number of con- 
straints and methodological difficulties with biological modelling using the same 
methodology. Thus, having placed artificial life and biological models within a the- 
oretical framework, social simulation will be subjected to a similar analysis. Finding 
the most appropriate framework for social simulation will lay the groundwork for 
Chap. 7, in which one of the more prominent exemplars of social simulation will be 
subjected to theoretical and methodological analysis. Chapter 7 lays the foundation 
for the conclusions of Part II, in which our analysis of Schelling’s residential 
segregation model will provide a means to demonstrate the most important elements 
of a useful modelling framework for the social sciences. 


6.2 Frameworks and ALife: Strong ALife 


6.2.1 Strong ALife and the Lack of ‘Real’ Data 


As noted by critics of artificial life, social simulation and related methodologies, 
computational simulations suffer from a perceived lack of ‘real’ data, or data derived 
from experimental observation. Part of this inherent difficulty stems from the need 
for abstraction in many such models; for example, connectionist models of cognitive 
processes embrace the idea of ‘distributed representation’ and their potential role 
in cognition, while generally avoiding integrating those models into larger, more 
complex neural structures as seen in the brain (Rumelhart and McClelland 1986). 

Strong ALife simulations suffer even more strongly from this shortcoming. 
Ray’s Tierra provides an enticing look at a ‘digital ecology’ composed of evolving 
computer programmes competing for memory space (Ray 1996), but those creatures 
are purely artificial constructions. While the parasites and hyper-parasites which 
eventually evolve in Tierra’s world may provide an analogue to real-life parasitic 
behaviour, the specialised nature of their virtual world is such that analyses of 
Tierra would be very difficult to apply to the natural world. Ray might argue that 
his open-ended evolutionary system, which lacks the standard fitness function of 
many genetic algorithms and instead provides selection only through life and death, 
evokes real-world interactions in evolving systems. Does the difficulty of verifying 
such claims confine analyses of Tierra to the realm of mathematical curiosity? 
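
The notion of selection 'only through life and death', without an explicit fitness function, can be illustrated with a toy example. The sketch below is emphatically not Tierra: it assumes that each replicator is reduced to a single number (its genome length), that copying takes time proportional to that length, and that a 'reaper' removes the oldest replicators when a fixed memory budget is exceeded. The function `run_soup` and all parameter values are hypothetical.

```python
import random

def run_soup(capacity=200, steps=2000, mutation_rate=0.1, seed=0):
    """Toy 'soup' of replicators competing for a fixed amount of memory.

    No fitness function is evaluated anywhere: differences in success
    arise only from how quickly replicators copy themselves and from
    which ones are removed when memory runs out.
    """
    rng = random.Random(seed)
    # Each entry is [length, progress towards a finished copy]; oldest first.
    soup = [[20, 0]]
    for _ in range(steps):
        offspring = []
        for cell in soup:
            cell[1] += 1
            if cell[1] >= cell[0]:             # a copy has been completed
                cell[1] = 0
                child_len = cell[0]
                if rng.random() < mutation_rate:
                    child_len = max(5, child_len + rng.choice([-1, 1]))
                offspring.append([child_len, 0])
        soup.extend(offspring)
        # Reaper: free memory by removing the oldest replicators.
        while len(soup) > 1 and sum(c[0] for c in soup) > capacity:
            soup.pop(0)
    return [c[0] for c in soup]

if __name__ == "__main__":
    lengths = run_soup()
    print("replicators alive:", len(lengths))
    print("mean genome length:", sum(lengths) / len(lengths))
```

Even in so small a system the modelling decisions (what counts as memory, how death is administered, what a mutation does) are entirely artificial, which gives some sense of why results from such worlds are difficult to carry over to natural ecologies.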


6.2.2 Artificial1 vs Artificial2: Avoiding the Distinction 


As noted in Silverman and Bullock’s description of differing varieties of artificiality 
in science (Silverman and Bullock 2004), providing a distinction between man- 
made instances of natural systems and man-made facsimiles of natural systems is 
important to understanding the goals of a simulation. In the case of strong ALife, 
researchers aim to produce models that embody Artificial1, or a man-made instance 
of something natural; these models claim to create digital life, rather than simply 
resemble biological life. In this case, the strong ALife researcher falls into the role 
of a sort of digital ecologist, studying the behaviour and function of real, albeit 
digital, organisms. Of course, such claims seem remarkable, but in the absence of a 
complete and verifiable definition of life such claims are difficult to refute. 


6.2.3 Information Ecologies: The Importance of Back-stories 


The inclination of the strong ALife researcher to study ‘real’ digital organisms 
points to the importance of formulating a theoretical backstory for any given 
simulation model. As per Silverman and Bullock’s PSS Hypothesis for Life, 
presuming that: 


1) An information ecology provides the necessary and sufficient conditions for life. 
2) A suitably-programmed computer is an example of an information ecology. 


... then the strong ALife researcher may claim, under such a framework, that 
their simulation represents an information ecology, and thus is a digital instantiation 
of a biological system. Whether or not the low-level functions of that system match 
those of real-life, carbon-based biological systems is immaterial; the only criterion 
is the presence of an ecology of information in which genetic material competes for 
representation, and under this criterion the position stated here is justifiable. 

To use our central example, if we construct a bird migration model in Alife 
fashion using individual interacting agents, we may wish to demonstrate that we 
are in fact performing some form of empirical data collection, rather than simply 
investigating an interesting mathematical system. So, we claim in this case that our 
simulated birds do in fact demonstrate an information ecology; perhaps our agents 
evolve and reproduce, thus producing a dynamic of information amongst the agent 
population. If we follow Langton, and are willing to class ourselves in the strong 
Alife camp, then signing up to the PSS Hypothesis for Life may be a good course 
of action for us to take. In that case, our bird model becomes a true information 
ecology, and thus presents an opportunity for empirical data-collection in this virtual 
population of birds. 


6.3 Frameworks and ALife: Weak ALife 


6.3.1 Artificial1 vs. Artificial2: Embracing the Distinction 


In contrast to strong ALife, weak ALife faces some initially more daunting 
theoretical prospects. This seems somewhat paradoxical; after all, the strong ALife 
researcher seeks to equate digital organisms with natural organisms, whereas the 
weak ALife researcher seeks only the replication of certain properties of natural 
life. However, while the strong ALife researcher may justify their investigations 
into a digital ecology by signing up to an appropriate theoretical backstory, however 
far-fetched, proving the relation between a natural system under investigation and a 
computational model based on that system is a more difficult problem and requires 
more in-depth justifications. 

Returning to our central example, recall our researcher who wishes to model the 
flocking behaviour of migrating birds. While a great deal of experimental data exists 
upon which one can base such a computational study, the researcher must choose 
which elements of that data provide a useful background for the simulation and 
which do not. Data about bird migration varies greatly across different species and 
climates, and the researcher must identify the most salient forms of collected data 
to use as a basis for the model. These choices will in turn inform the construction 
of the model, and the related points of theory regarding migration that must be 
incorporated into that model. 
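
To see how many such choices even a very small flocking model embodies, consider the following sketch in the style of the well-known 'Boids' rules of separation, alignment and cohesion. It is an illustration only: the functions `step` and `make_flock`, the two-dimensional world, and every weight and radius are assumptions of ours, not values drawn from any migration data.

```python
import math
import random

def step(boids, radius=5.0, sep_radius=1.0,
         w_sep=0.05, w_ali=0.05, w_coh=0.01, max_speed=1.0):
    """Advance a list of boids [(x, y, vx, vy), ...] by one time step."""
    updated = []
    for i, (x, y, vx, vy) in enumerate(boids):
        neighbours = [b for j, b in enumerate(boids)
                      if j != i and math.hypot(b[0] - x, b[1] - y) < radius]
        if neighbours:
            n = len(neighbours)
            cx = sum(b[0] for b in neighbours) / n
            cy = sum(b[1] for b in neighbours) / n
            avx = sum(b[2] for b in neighbours) / n
            avy = sum(b[3] for b in neighbours) / n
            # Cohesion: steer towards the centre of nearby birds.
            vx += w_coh * (cx - x)
            vy += w_coh * (cy - y)
            # Alignment: steer towards the neighbours' mean velocity.
            vx += w_ali * (avx - vx)
            vy += w_ali * (avy - vy)
            # Separation: steer away from birds that are very close.
            close = [b for b in neighbours
                     if math.hypot(b[0] - x, b[1] - y) < sep_radius]
            vx += w_sep * sum(x - b[0] for b in close)
            vy += w_sep * sum(y - b[1] for b in close)
        # Cap the speed so that the flock does not accelerate without bound.
        speed = math.hypot(vx, vy)
        if speed > max_speed:
            vx, vy = vx / speed * max_speed, vy / speed * max_speed
        updated.append((x + vx, y + vy, vx, vy))
    return updated

def make_flock(n=30, size=20.0, seed=0):
    """Create n boids with random positions and velocities."""
    rng = random.Random(seed)
    return [(rng.uniform(0, size), rng.uniform(0, size),
             rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]

if __name__ == "__main__":
    flock = make_flock()
    for _ in range(100):
        flock = step(flock)
    print(flock[0])
```

Each default argument of `step` is a point at which empirical data about a particular species could, in principle, be brought to bear, and equally a point at which the modeller's theoretical preferences may enter unannounced.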

As Chalmers suggests (Chalmers 1999), these abstractions reveal the inherent 
theory-dependence of artificial life as an enterprise; the researcher’s choice of 
abstractions may conform to their specific theoretical biases. In order to make these 
choices effectively, and to draw a strong correlation between the digital system 
and the natural system, one must find a framework which embraces the inherently 
artificial nature of such simulations and uses the advantages of the digital medium 
effectively to draw useful experimental conclusions. 


6.3.2 Integration of Real Data: Case Studies 


6.3.3 Backstory: Allowing the Artificial 


The integration of a suitable theoretical backstory into weak ALife research seems 
a more difficult task than for strong ALife. As Bryan Keeley describes, weak ALife 
can only hope to be functionally related to natural life, producing behaviours that 
are analogous to those displayed in biology (Keeley 1997). However, establishing 
a clear relationship to a natural system is not always straightforward, particularly 
when the artificial system under consideration bears less resemblance to the 
fundamental make-up of the natural world. 

For some researchers and theorists, these artificial worlds present a tantalising 
opportunity to examine characteristic evolutionary behaviours in a simplified envi- 
ronment, one amenable to study; given that natural evolution is far more difficult to 
observe than an abstracted evolutionary algorithm, this naturally seems an attractive 
prospect for those who wish to observe evolution in action. Ray (1994, 1996) goes 
so far as to assert that digital forms of evolution can produce the same diversity 
of forms that we observe as a result of natural evolution (though perhaps given his 
statements regarding Tierra as a whole, this position is, for him, milder than most). 

Unfortunately for Ray, while Tierra does display an impressive variety of self- 
reproducing digital entities that display unexpected behaviour, the simulation tends 
to get caught in an eventual evolutionary cycle in which certain forms repeat. This 
contrasts strongly with real-world evolutionary behaviour, in which the general 
level of complexity in evolving populations tends to continue to increase over 
time. Similarly, Avida and other simulation environments developed by the ALife 
community suffer the same problem, preventing the community from replicating 
the sort of staggering diversity seen in nature (Adami and Brown 1994). Bearing 
this in mind, can the researcher be certain that these artificial evolutionary systems 
are truly functionally related to natural evolutionary systems? Given that the overall 
processes of evolution are still under constant debate and revision, and given the innate 
difficulty of providing appropriate selection pressures in an artificial environment, 
how much do these systems coincide on a ‘given level of abstraction’ (Keeley 1997)? 

In cases such as this, one must be careful in defining a theoretical backstory 
linking the natural system to the artificial. A clear statement of the mechanisms 
at work in the simulation and how they relate to similar theorised mechanisms in 
natural evolution seems most helpful; given that there is no fundamental metric 
to determine just how abstracted a given model is when compared to reality, a 
clear statement of assumptions made and potential confounds in the simulation (e.g., 
difficulties in fitness functions and similar issues) could be helpful in attempting to 
link the simulation results to empirical results. 

In the case of our bird example, such a backstory would need to include 
information about the assumptions made when constructing our simulation. We 
would have to describe the real-world instances of bird behaviour that we are trying 
to replicate, and how these real-world instances have influenced our implementation 
of the model. Where simplifications have been made, e.g. by simplifying the 
structure of the agents to facilitate computability and simplicity of analysis, we 
would need to note this fact and mention how these simplifications may change 
the character of the behaviour observed in the simulation. In the ideal situation, 
someone reading a paper describing our bird model should be made aware of the 
shortcomings of the simulation, where it attempts to reproduce real-world bird 
behaviour and physiology, and where it makes abstractions for the purposes of 
making the simulation tractable. 


6.4 The Legacy of Levins 


6.4.1 The 3 Types: A Useful Hierarchy? 


As discussed at length in Chap. 4, the Levinsian framework for modelling in 
population biology appears generally useful for ALife modelling endeavours. After 
all, Levins seemed intent upon creating a pragmatic framework for constructing 
biological models, and since ALife often falls within that remit his ideas remain 
relevant. However, the extended framework developed from Levins in Chap. 4 seems 
perhaps more useful within the context of artificial life. 

With Braitenberg’s Law in mind, the concept of a tractability ceiling placed on 
ALife seems appropriate, if vexing. With ALife systems spanning an enormous 
variety of biological phenomena, often incorporating versions of vast and complex 
biological mechanisms such as evolution, the question of analysis becomes of 
paramount importance. While we may comfortably classify ALife models within 
Levins’ three types with relatively little difficulty, we remain uncertain how 
analysable those models will prove to be with only that classification in mind. 


6.4.2 Constraints of the Fourth Factor 


Indeed, the fourth Levinsian factor appears to place some serious limitations upon 
ALife systems. As noted in Chap. 3, such systems frequently fall into the Type 
3 category, oriented as they are toward producing broad-stroke investigations of 
generalised populations. Applying those results toward real populations becomes 
increasingly problematic as tractability concerns become important; without rea- 
sonable analysis of the data produced by these simulations, the researcher will 
have great difficulty applying that data to any understanding of a natural biological 
system. In essence, even with carefully-designed simulation models, this tractability 
ceiling prevents highly complex simulations from yielding great insight. 

Thus, our Alife-type bird migration model may run into difficulties if we 
incorporate too many real-world complexities. If we use agents with neural network 
controllers, for example, then such networks are very difficult to analyse (recall Beer 
2003a,b). As modellers we must judge whether the use of such components in the 
simulation is justified given the increase of complexity and analytical difficulty. The 
more we incorporate added elements in an attempt to capture real-world complexity, 
the more we approach the tractability ceiling. 
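
One rough way to appreciate the analytical cost of a neural network controller is simply to count its free parameters. The helper below assumes a fully connected feed-forward network with one bias per unit; the function name `count_parameters` and the example layer sizes are hypothetical, and parameter count is only a crude proxy for analytical difficulty, not a measure of tractability in the Levinsian sense.

```python
def count_parameters(layer_sizes):
    """Number of weights and biases in a fully connected feed-forward net.

    layer_sizes lists the number of units per layer, inputs first.
    """
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

if __name__ == "__main__":
    # A small controller: 4 sensors, 8 hidden units, 2 motor outputs.
    print(count_parameters([4, 8, 2]))  # prints 58
```

A rule-based bird agent governed by a handful of weights and radii can be examined one parameter at a time; even this small network has 58 interacting parameters per agent, before any evolutionary or learning dynamics are taken into account.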

Even assuming that our model manages to balance Levins’ three types appro- 
priately while maintaining a reasonable level of tractability, significant problems 
still remain. With Braitenberg’s Law in mind, agent-based models in general 
suffer a greater disconnect between invention and analysis than other modelling 
methodologies, leading to a greater chance of producing impenetrably opaque 
simulations. 

If we then apply agent-based methodologies to the field of social science, in 
which there are already significant difficulties in constructing models based upon 
strongly empirical social data, then these problems become increasingly acute. 
Bearing in mind these discussions of theoretical frameworks in relation to ALife 
and agent-based models in general, we shall examine how these same frameworks 
impact upon the agent-based social simulations detailed in the previous chapter. 


6.5 Frameworks and Social Science 


6.5.1 Artificial1 vs. Artificial2: A Useful Distinction? 


ALife research focuses on living systems and their governing processes, and as such 
relies upon definitions and theories of life as a basis for enquiry. Life being difficult 
to define empirically, strong ALife can, as illustrated earlier, attempt to demonstrate 
a digital instantiation of life itself when life is defined appropriately within the 
context of that research. However, when expanding beyond processes relating to 
simple organisms and populations and attempting to model social structures and 
processes, additional layers of complexity come into focus. Epstein (1999) provides 
a comparison to the famous ‘Boids’ model of flocking behaviour: 


Generating collective behaviour that to the naked eye “looks like flocking” can be extremely 
valuable, but it is a radically different enterprise from generating, say, a specific distribution 
of wealth with parameters close to those observed in society. Crude qualitative caricature is 
a perfectly reasonable goal. But if that is one’s goal, the fact must be stated explicitly.... 
This will avert needless resistance from other fields where “normal science” proceeds under 
established standards patently not met by cartoon “boid” flocks, however stimulating and 
pedagogically valuable these may be. (pp. 52-53) 


In essence, Epstein presents a position reminiscent of our earlier discussion of 
Braitenberg: imitation of a system with a model, as in mimicking the behaviour 
of flocking birds, is far simpler than creating a model system which can generate 
that behaviour. Further, such models can confuse the research landscape, perhaps 
claiming to produce more insight than they are fundamentally capable of providing. 
In such cases a model which seeks this sort of qualitative similarity is not 
constructed in such a way as to allow any insight into the root causes of that complex 
behaviour. Epstein implies later in his discussion that in the case of social science, 
which involves interacting layers of actors in a society, this problem becomes more 
acute for the computational modeller. 

In such a context, any argument for ‘strong social-science simulation’ seems 
difficult to justify; in order to accept that a given social model is a digital 
instantiation of a social system, one would have to accept that the agents have 
sufficient complexity to interact with one another, generate a social structure, and 
react and respond to that structure in an appropriately non-trivial manner. There is a 
significant danger, as noted by Epstein, of a model of a complex social system falling 
within the realm of a ‘crude qualitative caricature.’ Social science then may be said 
to lie within the domain of Artificial2: something made to resemble something else. 

Fortunately, our examination of Luhmann has revealed the problematic nature 
of building an Artificial1 social simulation. With our search for a fundamental 
social theory relying upon developing a new means for hypothesis-testing and social 
explanation, creating instantiations of ‘digital society’ is rather less than useful. 
Thus, while remaining within the domain of Artificial2 may seem initially limiting, 
in fact the social simulator is likely to find it of far greater utility than the alternative. 


6.5.2 Levins: Still Useful for Social Scientists? 


As demonstrated earlier in this chapter, Levins’ framework for modelling in 
population biology is remarkably applicable to today’s more modern computational 
methodologies. Our updated Levinsian framework, developed in Chap. 4, and its 
consideration of tractability present an account of a concern common to most 
varieties of computational models: efficient utilisation of computing resources relies 
on a relatively tractable and analysable problem. However, when considering the 
application of such a framework to social simulation, some differing concerns come 
to light. 

As Gilbert and Terna (2000) note, emergent phenomena in 
social science reflect a certain additional complexity in comparison to the natural 
sciences: 


In the physical world, macro-level phenomena, built up from the behaviour of micro- 
level components, generally themselves affect the components.... The same is true in the 
social world, where an institution such as government self-evidently affects the lives of 
individuals. The complication in the social world is that individuals can recognise, reason 
about and react to the institutions that their actions have created. Understanding this feature 
of human society, variously known as second-order emergence Gilbert (1995), reflexivity 
Woolgar (1988) and the double hermeneutic, is an area where computational modelling 
shows promise. (p. 5) 


Thus, Gilbert and Terna contend that agent-based models can capture these 
emergent phenomena more vividly than other methodologies which, while providing 
potentially strong and useful predictions of a macro-level system’s behaviour, do 
not provide an explanation of that behaviour in terms of its component parts. Epstein 
(1999) takes this statement further, arguing that in certain contexts, successful math- 
ematical models may be ‘devoid of explanatory power despite [their] descriptive 
accuracy’ (p. 51). Epstein goes on, however, to acknowledge the difficulties inherent 
in creating an artificial society with sufficient ‘generative power,’ proposing the use 
of evolutionary methods to find the rule-sets most amenable to complex emergent 
behaviour in a given simulation. 

Levins’ framework, which depends upon tractability as a key concern for the 
modeller, may at first blush seem insufficient to deal with the methodological 
complexities inherent in the simulation of social structures. After all, the researcher 
is dealing with multiple interacting layers of complexity, with some of these 
emergent behaviours relying upon not just reactions to a changing environment, 
as in an artificial life simulation, but a cognitive reaction to that environment, which 
can then influence both that agent and others within that artificial society. With such 
high-level abstractions taking place in these simulations, how might one quantify 
the realism, generality and precision of a social model? 

To clarify, imagine that we have designed and implemented an agent-based 
model of our migrating bird population. Each agent is capable of moving through 
a simulated spatial environment, reproducing, and thus evolving through simulated 
generations. Now, in an effort to capture the social effects at play within a bird 
colony upon arrival at its destination, we allow each agent to communicate and 
exchange information with its compatriots. In order to capture this we suddenly 
need to add all sorts of new elements to our implementation: a means of encoding 
communication between individuals, a means of choosing the method and frequency 
of those communications, and a means of deciding how these interactions will affect 
the agent, its communication partners, and the surrounding environment. How can we capture 
such effects? How might we model the birds’ communications, and how they affect 
the simulation as a whole? Already we have introduced a number of new factors 
into the model for which there is little hard empirical information to guide our 
implementation of these factors. If we cannot identify the level of realism of these 
new components of the simulation, how is it possible to clarify the position of our 
new model amongst Levins’ four dimensions of model-building? 
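
To make this growth in unconstrained choices concrete, the following is a minimal Python sketch rather than an implementation of any published bird model. The names Bird, communicate, contact_rate and trust are hypothetical, as is the reduction of the environment to a one-dimensional route; each is precisely the sort of modelling decision for which little empirical guidance exists.

```python
import random
from dataclasses import dataclass


@dataclass
class Bird:
    position: float  # location along a one-dimensional route (an assumption)
    belief: float    # the agent's current estimate of where food lies


def communicate(birds, rng, contact_rate=0.5, trust=0.25):
    """One round of pairwise information exchange.

    contact_rate: probability that a bird speaks to a partner this round.
    trust: fraction by which the listener shifts toward the speaker's belief.
    Both values are free parameters with no empirical grounding here.
    """
    for speaker in birds:
        if rng.random() < contact_rate:
            listener = rng.choice(birds)
            if listener is not speaker:
                listener.belief += trust * (speaker.belief - listener.belief)


def step(birds, rng, speed=1.0):
    """Exchange information, then move each bird toward its believed target."""
    communicate(birds, rng)
    for bird in birds:
        delta = bird.belief - bird.position
        bird.position += max(-speed, min(speed, delta))


def belief_spread(birds):
    """Range of beliefs held across the flock."""
    beliefs = [b.belief for b in birds]
    return max(beliefs) - min(beliefs)


rng = random.Random(0)  # fixed seed so the run is repeatable
flock = [Bird(position=0.0, belief=rng.uniform(0, 100)) for _ in range(20)]
before = belief_spread(flock)
for _ in range(200):
    step(flock, rng)
after = belief_spread(flock)
```

Even this toy version introduces three numerical parameters (contact_rate, trust, speed) and one structural choice (random pairing), none of which can be checked against field data. Because each update moves a listener's belief part of the way toward the speaker's, the spread of beliefs can only shrink; whether real birds behave in any comparable way is exactly what the sketch cannot tell us.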

In essence, without a solid set of criteria identifying the realism of simulated 
social behaviours and cognition, one becomes reliant on qualitative judgements 
to decide the validity of a social simulation; in other words, the researcher must 
determine that the behaviour of that artificial society sufficiently resembles the 
behaviour of a real society to call it a successful result. Perhaps then social 
simulations begin skewed away from realism, and further toward generality and 
precision than other simulation varieties. With realism so difficult to define in a 
social context, and with the necessary abstractions for social simulation so wide- 
ranging, the researcher seems best served by striving to provide examples of possible 
mechanisms for an emergent social behaviour based upon a very general picture 
of agent-based interactions and communications. Definitive statements regarding 
which of these mechanisms are responsible for a given social phenomenon may be 
impossible, but the model could provide a means for illuminating the possible role 
of some mechanisms in the formation and behaviour of social structures (this being 
one of Epstein’s suggested roles for simulation in ‘generative social science’). 


6.5.3 Cederman’s 3 Types: Restating the Problem 


If Levins’ framework is insufficient to capture some of the methodological complex- 
ities particular to modelling for the social sciences, perhaps this framework could 
be usefully informed by Cederman’s own framework of C1, C2 and C3 political 
models (see Table 5.1). With some examination of this framework, we may be able 
to draw useful parallels between these three types and those described by Levins. 

Cederman’s C1 models focus on modelling the behavioural aspects of social 
systems, or the emergence of general characteristics of certain social systems in a 
computational context. He cites Axelrod’s Complexity of Cooperation as a major 
foray into this type of modelling, as well as Schelling’s residential segregation 
model (Cederman 2001). This type of model seems related to Levins’ L3 models, 
which sacrifice precision for realism and generality; Cederman’s C1 models do not 
attempt to link with empirical data, but instead seek to demonstrate the emergence 
and development of behavioural phenomena in a generalised or idealised context. 

Cederman’s C2 models aim to explain the emergence of configurations in a 
model due to properties of the agents within the simulation (Cederman 2001). 
Examples of this methodology include Lustick’s Agent-Based Argument Repertoire 
Model (Lustick 2002, 2006), which provides agents with a complex set of opinions 
which they may communicate with other agents. The opinions of agents within 
the simulation can alter through these communications, leading to the apparent 
generation of social structures within the agent population. 

In this case, the comparison with Levins is more difficult; while the agents 
themselves are designed to be more complex within a C2 model, can this necessarily 
be pinned down as an increase in either precision or realism? The closest analogue 
appears to be Levins’ L2 models, which eschew realism in favour of generality and 
precision. While a C2 Cederman model or an L2 Levins model is not concerned 
with comparison to empirical data, these models do seek to provide a useful 
framework for describing complex social behaviours in an idealised context, similar 
to those population biology models alluded to by Levins. In either discipline, the 
modellers hope to illuminate some of the contributing factors that produce the 
modelled behaviour, and perhaps stray closer to a useful description of the real- 
life instantiation of that behaviour than may be expected initially from such an 
abstracted model. 

Cederman’s C3 models are the most ambitious in that they attempt to model 
the emergence of both agent behaviours and their interaction networks (Cederman 
2001). He specifically cites ALife as a field which may provide illuminating 
insights in that regard, given that ALife is concerned with such complex emergent 
behaviours and structures. As discussed earlier, however, Cederman’s categories 
have no hard borders between them, particularly in the case of C3 models; Cederman 
himself admits that C1 and C3 can easily overlap. As a consequence, identifying 
where the C3 models lie amongst a given crop of social science simulations is not 
always such a simple task, though the C3 categorisation does provide useful context 
in which to examine how the goals of different varieties of models can affect their 
construction and implementation. 

Once again, though, the comparison with Levins is difficult; allowing for such 
emergent behaviours as described by Cederman in the C3 categorisation does not 
correlate directly with Levins’ three categorisations. Perhaps the closest analogue 
here is once again Levins’ L3 models, given the focus on emergence of both agent- 
level and societal-level behaviours; in neither case is the modeller overly concerned 
with a direct relation to empirical data. Instead the modeller hopes to provide a 
cogent explanation for the emergence of social behaviours by allowing agents to 
interact and change due to pressures within the model, rather than due to complex 
constraints placed upon that model. 


6.5.4 Building the Framework: Unifying Principles for Biology 
and Social Science Models 


We have clearly run into a difficulty in comparing Levins and Cederman’s respective 
modelling frameworks. Levins reserves the L1 category for those models which 
seek a direct relationship to empirical data, leaving generality behind in favour 
of realism and precision. However, Cederman’s 3 types leave out this distinction, 
instead providing what appear to be two variations on Levins’ more general L3 
models. Both Cederman’s C1 and C3 models appear to leave precision behind, 
seeking instead to describe social phenomena in an idealised context; C1 models 
focus purely on the emergence of social structures, while C3 models focus on the 
emergence of both societal and individual structures and behaviours. 

Cederman’s view can be used to further qualify Levins’ original framework, 
however. Levins himself cites L3 models as the most promising, and his preferred 
methodology within population biology; similarly, Cederman cites his C3 models 
as the most promising within political and social science (though interestingly, 
Cederman’s recent work has strayed towards a version of Levins’ L1; see Cederman 
2006 and the section below). The question then becomes: how can we characterise 
Cederman’s C1 vs. C3 distinction in terms of the Levinsian factors of generality, 
precision and realism? 


Table 6.1 One possible Levins/Cederman framework 

Modified Levinsian framework 

L1   Precision and realism at the expense of generality 
L2   Generality and precision at the expense of realism 
L3A  Generality and realism at the expense of precision at one level of simulation 
L3B  Generality and realism at the expense of precision at multiple levels of simulation 


One can envision a further subdivision of Levins’ L3 into L3A and L3B: 
L3A being characterised by a sacrifice of precision in one level of the simulation 
(i.e., Cederman’s C1, which seeks emergent social behaviours), and L3B being 
characterised by a sacrifice of precision at multiple levels (as in Cederman’s C3, 
which seeks both emergent social behaviours and interaction networks); Table 6.1 
provides a summary of this potential framework. 
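
The subdivision just described can be written down as a small lookup, shown here as a Python sketch. The names Property and classify, and the use of a count of imprecise levels to separate L3A from L3B, are illustrative assumptions rather than part of either Levins’ or Cederman’s own formulations.

```python
from enum import Enum


class Property(Enum):
    GENERALITY = "generality"
    REALISM = "realism"
    PRECISION = "precision"


def classify(sacrificed: Property, imprecise_levels: int = 1) -> str:
    """Return the modified Levinsian category for a model.

    sacrificed: the one property the model gives up.
    imprecise_levels: number of simulation levels at which precision is
        sacrificed; only consulted when precision is the property given up.
    """
    if sacrificed is Property.GENERALITY:
        return "L1"  # precision and realism retained
    if sacrificed is Property.REALISM:
        return "L2"  # generality and precision retained
    if imprecise_levels < 1:
        raise ValueError("precision must be sacrificed at one level or more")
    # Precision sacrificed: one level gives L3A, several give L3B.
    return "L3A" if imprecise_levels == 1 else "L3B"


# A C1-style model (emergent social behaviour only) and a C3-style model
# (emergent behaviours and interaction networks) under this reading:
c1_category = classify(Property.PRECISION, imprecise_levels=1)
c3_category = classify(Property.PRECISION, imprecise_levels=2)
```

The sketch makes one limitation of the scheme visible: it requires the modeller to count ‘levels’ of a simulation, which is itself a judgement call.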

Alternatively, with the incorporation of tractability into the Levinsian framework 
as a fourth factor as in Chap. 4, this subdivision may be much simpler. Both Levins 
and Cederman acknowledge these L3 models to be both the most useful and the 
most challenging; perhaps then Cederman’s C1 and C3 may be characterised by 
differing levels of tractability. In this way we can maintain this modified Levinsian 
framework as a more fluid continuum, based upon the determining factor of overall 
tractability, rather than introducing additional sub-categories for special cases of 
specific methodologies. 


6.5.5 Integration of Real Data 


Some researchers involved in computational modelling have attempted to sidestep 
the difficulties of the inherent lack of ‘real data’ by attempting to integrate exper- 
imental data into their simulations. Cederman’s 2006 paper on geo-referencing in 
datasets for studies of civil wars provides a useful example. In this case, Cederman 
uses data and maps from the Russian Atlas Narodov Mira, an atlas produced by a 
1960s Soviet project aiming to chart the distribution of ethnic groups worldwide. 
The project then seeks to formulate agent-based models using this data to examine 
the potential causes of ethnic conflict. As Cederman notes, ‘there is no substitute 
for real-world evidence’ when attempting to understand the causes of such conflict; 
however, the ethnographic data in this case is both limited and quite old. 

Of course, the worldwide ethnographic distribution has likely changed significantly 
since the publication of this atlas in 1964, and updating the information 
contained therein is no simple task. The integration of such data into an agent- 
based computational model seems like a potentially fruitful method for tying the 
results of that model more closely to political and social reality. However, with the 
limitations of this dataset and the difficulties inherent in collecting future data of 
a similar type, is this data integration still useful as a framing mechanism to place 
this model on more solid empirical ground, or is it an interesting but ultimately 
misguided enterprise? 

A further difficulty with the Atlas Narodov Mira dataset is that the atlas provides 
a static ethnographic picture. While it is a remarkably detailed look at a particular 
time in world history and the related distributions of peoples throughout the globe, 
the lack of similar data in the following decades leaves us with an inability to directly 
associate the intervening political and social changes with ethnic conflicts that have 
erupted in the years since the atlas’ publication. Some argue that computational 
modelling in such a circumstance provides a remarkable capacity for hypothesis 
testing; by basing a relatively realistic model on such a solid footing of experimental 
and observational data, the researcher can experiment with varying parameters and 
initial conditions in an attempt to replicate the ethnic conflicts seen since the atlas 
was produced. However in such a case the same problems return to haunt the 
researcher: deciding which abstractions to make can be critical, and deciding how 
to formalise the influence of complex social and political interactions is far from 
trivial. 


6.6 Views from Within Social Simulation 


6.6.1 Finding a Direction for Social Simulation 


While the last chapter provided an in-depth examination of the current state of social 
simulation, and an analysis of its ability to explain and interpret real-world social 
phenomena, we still have relatively little understanding of the perception of the 
utility of social simulation within the social sciences. A number of prominent social 
simulators display great enthusiasm for the pursuit, as would be expected, but how 
do those viewing this growing trend from other parts of the field react? 

With this in mind we will examine some views of the general purpose of social 
simulation, and descriptions of the perceived problems of the methodology, from 
within the social sciences. We can then incorporate these analyses with our own 
discussion of the importance of simulation in a fundamental social theory and 
further expand our growing methodological and theoretical framework for social 
simulation. 


6.6.2 Doran’s Perspective on the Methodology 
of Artificial Societies 


Doran in his article for Tools and Techniques for Social Science Simulation 
provides a coherent examination of the major difficulties facing the artificial society 
methodology in social simulation (Doran 2000). Doran describes this method as 
follows: 


The central method of artificial societies is to select an abstract social research issue, 
and then to create an artificial society within a computer system in which that issue 
may systematically be explored. Building artificial societies is a matter of designing and 
implementing agents and inter-agent communication, in a shared environment, and existing 
agent technology provides a wealth of alternatives in support of this process. (p. 18) 


Thus, Doran views social simulation as a means of examining the workings of 
large-scale, abstract social issues. For Doran simulation provides a way to generate 
new ‘world histories,’ allowing the researcher to watch a society grow from a 
provided set of initial conditions and assumptions. 

Of course he notes the significant methodological difficulties inherent in this 
artificial society approach. He identifies three main problems: searching the space 
of initial conditions and parameters; describing the results of the simulation in a 
tractable way; and overcoming problems of instability due to changes in initial 
conditions. The first and third problems here are highly reminiscent of more 
general problems common to agent-based modelling as a whole, while the problem 
of analysis and tractability harkens back to Levins and our initial theoretical 
explorations of artificial life. 
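
Doran’s three problems can be illustrated with a short Python sketch. The function toy_society is a hypothetical stand-in for a full artificial-society run, and its parameters (cooperation_bias, population) are invented for illustration only. The sweep function searches a small grid of parameters, repeats each setting over several random seeds, and condenses the outcomes into a mean and a spread.

```python
import itertools
import random
import statistics


def toy_society(cooperation_bias, population, seed, rounds=50):
    """Hypothetical stand-in for an artificial-society simulation.

    Returns the final fraction of cooperating agents.
    """
    rng = random.Random(seed)
    cooperating = [rng.random() < 0.5 for _ in range(population)]
    for _ in range(rounds):
        share = sum(cooperating) / population
        # Each agent cooperates with a probability tied to the current share.
        p = min(1.0, max(0.0, share + cooperation_bias))
        cooperating = [rng.random() < p for _ in range(population)]
    return sum(cooperating) / population


def sweep(biases, populations, seeds):
    """Run every parameter combination over every seed and summarise."""
    results = {}
    for bias, pop in itertools.product(biases, populations):
        outcomes = [toy_society(bias, pop, seed) for seed in seeds]
        results[(bias, pop)] = (
            statistics.mean(outcomes),    # tractable summary of the runs
            statistics.pstdev(outcomes),  # sensitivity to initial conditions
        )
    return results


summary = sweep(biases=[-0.05, 0.0, 0.05], populations=[20, 200], seeds=range(10))
for (bias, pop), (mean, spread) in sorted(summary.items()):
    print(f"bias={bias:+.2f} pop={pop:3d} mean={mean:.2f} spread={spread:.2f}")
```

The grid covers the first problem, the two summary statistics address the second, and the spread across seeds gives a rough indication of the third. With only two parameters the search is trivial; the number of runs grows multiplicatively with each additional parameter, which is where the search problem becomes serious.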

Interestingly, however, Doran posits that the greatest problem facing the social 
simulator is the largely undefined nature of the computational agent itself. He notes 
that different uses of agent-based models often incorporate different base properties 
for the agents within the model, and that despite the simplistic views of what 
constitutes an agent within social science, there is little overall consensus regarding 
a definition of agent architectures. 

He argues that agents should be defined in computational terms, and thus should 
be able to emerge from a model in the same way as other more complex phenomena. 
Of course, this is not the normal course in most agent-based models; in practice 
such models are designed to include predefined agent structures based upon certain 
theoretical assumptions. As a consequence, this would not be an easy task for 
the modeller; constructing a simulation in which agent structures are expected to 
emerge seems extremely difficult. There would almost certainly need to be some 
sort of precursors to defined agents to allow such structures to develop, given that 
the earliest beginnings of our own life and society are so murky to begin with and 
can provide little clear inspiration. The purpose, however, is sound, as producing 
simulations that at least approach such possibilities would allow the modeller to step 
away further from the difficulties produced by theory dependence and pre-defined 
agent structures. 

Doran’s view is also interesting in that it meshes with our earlier discussion of 
Niklas Luhmann and the search for a fundamental social theory. Since in Doran’s 
view, artificial societies aim to examine the general development of human society in 
an abstract fashion, these societies must be based upon valid assumptions. However, 
those assumptions are necessarily based upon our own cultural experiences, and thus 
will be imprinted upon that model in some fashion. A model which eliminates this 
difficulty, or at least minimises it, would allow for agents to develop in simulation 
which bear far fewer markings of our own societal preconceptions. For the social 
scientist looking to develop new social theory, such an approach would be far more 
fruitful than the heavily theory-dependent alternative. 

As discussed in the previous chapter, Doran agrees that artificial societies should 
strive to develop from the earliest beginnings of societal interaction (‘the lowest 
possible level, even below the level of agent’ in Doran’s phrasing [p. 24]), and 
that the mechanisms of society should emerge from that basis. This would avoid 
producing a simulation constructed around pre-existing assumptions from our own 
society. 


6.6.3 Axelrod and Tesfatsion’s Perspective: The Beginner’s 
Guide to Social Simulation 


Axelrod and Tesfatsion (2005) lay out a brief guide to social scientists hoping to 
incorporate social simulation into their current work. Given the authors’ prominent 
position in the current social simulation field, this guide provides an illuminating 
look at those aspects of agent-based modelling perceived to be the most valuable by 
those within this area of research. 

After beginning with a brief introduction to the basic characteristics of agent- 
based models (including a brief discussion on creating ‘histories,’ as described by 
Doran), Axelrod and Tesfatsion lay out a four-part description of the potential goals 
of simulation models in the social sciences. They describe each of these potential 
goals in turn: 


1. Empirical understanding: ‘ABM researchers seek causal explanations grounded in the 
repeated interactions of agents operating in specified environments.’ 

2. Normative understanding: ‘ABM researchers pursuing this objective are interested in 
evaluating whether designs proposed for social policies, institutions, or processes will 
result in socially desirable system performance over time.’ 

3. Heuristic: ‘How can greater insight be attained about the fundamental causal mecha- 
nisms in social systems?’ 

4. Methodological advancement: ‘How best to provide ABM researchers with the methods 
and tools they need to undertake the rigorous study of social systems through controlled 
computational experiments?’ [pp. 4-5] 


Here we discover yet another framework underwriting agent-based models in 
social science. However, unlike the work of Levins or Cederman, Axelrod and 
Tesfatsion prefer not to discuss specific criteria which may place a given agent-based 
model within these categories, preferring instead to provide only examples of each 
research goal. 

Interestingly, despite their initial mention of Doran’s described ‘artificial soci- 
eties’ methodology, and the general goal of generating artificial ‘world histories,’ 
the remainder of Axelrod and Tesfatsion’s introduction presents a sharp contrast 
to Doran’s ideas. They stress the potential for agent-based models to produce 
substantive empirical insights related to real-world societies, rather than produce 
more general insights about the initial formation of societies and social interaction 
as a whole. 

However, given the caveats presented thus far with agent-based social simulation, 
Axelrod and Tesfatsion’s research goals seem at odds with the conclusions we have 
reached. If we agree, as discussed in the last chapter, that social simulation may 
lack some critical elements required to provide social explanation, then hoping to 
use such simulations to design real-world social institutions may produce disastrous 
results. Similarly, developing empirical understanding and ‘causal explanations’ 
would be equally problematic, particularly as any non-reductive aspects of the 
society under investigation would be difficult to identify. 

Even if we disagree with the alleged difficulties in social explanation for social 
simulation, Doran’s views have illuminated a further difficulty for Axelrod and 
Tesfatsion’s suggested approaches. Without a clear definition of what constitutes 
an ‘agent’ in an empirical or normative simulation, the level of correspondence 
between these agents and human social actors is too ill-defined. What cognitive 
capacities would agents require to provide that degree of understanding? Surely 
Schelling-esque simplicity is not going to be the best course for all possible 
investigations using simulated societies? 


6.7 Summary and Conclusions 


Thus far we have seen a great variety of proposed theoretical frameworks for 
the use of agent-based models. Artificial life provided a useful starting point, 
given its relationship to the long-standing use of mathematical models in various 
biological disciplines. Comparing the methodological difficulties of artificial life 
with mathematical models in biology allows us to develop a greater understanding 
of the problems most particular to agent-based models. 

However, applying the resulting methodological discussions to agent-based 
models in social science is not so straightforward. As discussed in Chap. 5, social 
scientists must cope with some unique difficulties: social structures are not clearly 
hierarchical in nature; empirical studies of social phenomena are frequently prob- 
lematic due to the interacting complexities of individual and collective behaviour; 
and social explanation may suffer from difficulties similar to those faced by 
researchers in mental phenomena, as some social phenomena may be similarly 
irreducible to straightforward low-level behaviour. 




Obviously all of these points impact the social simulator and make the job of 
model-building significantly more difficult. However, as seen during this analysis, 
these difficulties shift in emphasis depending on the purpose of the model (as we 
might expect from previous discussion regarding the Levins and Cederman frame- 
works). Taking an external, Luhmannian perspective, and incorporating Doran’s 
related views, social simulation may be able to provide a unique window into certain 
aspects of social theory that are otherwise inaccessible through standard empirical 
means. 

This methodology brings us back to the artificial world problem of Chap. 4. If 
we accept that ‘growing’ artificial societies from a level even below that of agents is 
a promising means for investigating potential fundamental social theories, we must 
likewise accept that these artificial societies may be quite difficult to relate to real- 
world societies. Given the gaps in our understanding in social science, ranging from 
unanswered questions in individual behaviour through to questions of high-level 
social organisation, can such artificial societies bring us any closer to the desired 
fundamental social theory? 

The next chapter will bring us closer to answering this critical question. By 
examining one of the most prominent exemplars of social simulation, Schelling’s 
residential segregation model, we will be able to illuminate this concern in greater 
detail. Schelling’s model is the very definition of abstraction — and yet it is credited 
with producing some important insights into the social problem it is based upon. 
By determining how Schelling’s model surpassed its abstract limitations, we can 
develop a framework which may underwrite future models in social simulation of a 
similar character. 

This analysis of Schelling marks a conclusion of sorts to the arguments and 
frameworks discussed thus far in Parts I and II. Having discussed modelling 
frameworks in Alife and biology, and having brought those together with new 
elements from simulation in the social sciences, Schelling’s model will provide 
a means to demonstrate the importance of these theoretical, methodological, and 
pragmatic frameworks for the modeller who wishes to push social science forward 
through simulation. As we shall see, Schelling’s simple model was not nearly so 
simple in its widespread influence and overall importance. The elements which 
contributed to this success can be seen via its relationships to the frameworks 
discussed thus far. 


Chapter 7 
Schelling’s Model: A Success for Simplicity 


7.1 Overview 


In the previous chapter the numerous methodological and theoretical frameworks 
elucidated thus far were analysed as a whole in an attempt to develop a useful 
synthesis for the social science modeller. Bringing together elements from popu- 
lation biology, artificial life, and social science itself, we examined the multiple 
dimensions that must be addressed in a successful social simulation model. 

In this chapter we will examine one particular example of such a successful 
model: Schelling’s residential segregation model. After describing the context in 
which this model was developed, we will analyse this relatively simple model under 
the constraints of each of the theoretical frameworks described thus far. Despite the 
abstractions inherent in Schelling’s construction, the model achieved a remarkable 
measure of recognition both inside and outside the social science community; our 
analysis will discuss how this ‘chequerboard model’ found such acceptance. 

Using Schelling as a case study, we can further develop our fledgling modelling 
framework for the social sciences. In the quest to underwrite social simulation 
as a more rigorous form of social-scientific exploration, Schelling will provide a 
valuable examination of the issues facing modellers as this methodology grows in 
relevance throughout social science. After discussing the issues brought forward by 
this analysis, which brings together the frameworks and methodological proposals 
discussed thus far, we will move on to the last chapter of Part II and present a 
synthesis. 


7.2 The Problem of Residential Segregation 


7.2.1 Residential Segregation as a Social Phenomenon 
7.2.1.1 The Importance of the Problem 


One of the most puzzling aspects of the sociological study of residential segregation 
is the sizeable gap between the individual residential preferences of the majority 
of the population and the resultant neighbourhood structure. The General Social 
Survey of major US metropolitan areas asked black respondents whether they 
preferred to live in all-black, all-white, or half-black/half-white areas; 55.3% of 
those surveyed stated a preference for half-black/half-white neighbourhoods (Davis 
and Smith 1999). However, census data reveals that very small percentages of black 
individuals in major US metropolitan areas actually live in half-black/half-white 
neighbourhoods; the 1990 census indicates that for most major cities, less than 5% 
of black residents live in such mixed areas. 


7.2.2 Theories Regarding Residential Segregation 


The phenomenon of residential segregation is a complex one, involving interacting 
contributing factors at various levels of societal structure. A definitive statement of 
why residential segregation occurs still troubles researchers; the problem varies so 
widely across cultures, societies and ethnic groups that the specific factors at play 
are still elusive. 

Freeman’s summary of residential segregation of African-Americans in major 
American metropolitan areas (Freeman 2000) provides one example of the diffi- 
culties facing social scientists in this area. As Freeman notes, African-Americans 
seem particularly prone to residential segregation; other minority populations tend 
to ‘spatially assimilate’ after a period of residential segregation, integrating with the 
majority population as educational and financial disparities decrease (Massey et al. 
1985). African-American populations, however, do not display significant spatial 
assimilation, despite the decrease in socio-economic disparities evident from the 
available data; some posit that this may be due to unseen bias in local housing 
authorities, cultural cohesion keeping African-Americans in established minority 
communities, or decreased access to public services in low-income areas, but none 
of these possibilities can fully account for these unusual tendencies in the data. 


7.3 The Chequerboard Model: Individual Motives 
in Segregation 


7.3.1 The Rules and Justifications of the Model 


Schelling’s ‘chequerboard model’ sought to make a singular point, as illustrated by 
the simplistic construction of the model. He argued that if the racial make-up of a 
given area was critical to an individual’s choice of housing, then even populations 
of individuals tolerant to mixed-race environments would end up segregating into 
single-race neighbourhoods (Schelling 1978). Schelling hoped to illustrate that 
large-scale factors such as socio-economic or educational disparities between ethnic 
populations could not explain the generally puzzling phenomenon of residential 
segregation; indeed, without a greater insight into the preferences and thought 
processes of individuals within a given population (their ‘micromotives’), some 
critical aspects of this social problem may elude the researcher. 

Schelling illustrated this idea using a model constructed in a simple fashion, 
using a type of cellular automaton model (initially constructed using a physical 
chequerboard, hence the model’s nickname). He describes a world composed of 
square cells, filled with agents of one of two types. Each agent interacts only with its 
eight direct neighbouring cells, and there are no higher-level structures incorporated 
into the model. The agents are given a tolerance threshold: if the number of adjacent 
agents of its own type is less than that threshold, the agent will move to a nearby 
location on the grid where its tolerance requirements are fulfilled. Thus, the model 
incorporates a very simple rule set which allows each agent to display a singular 
micromotive which is taken to represent that agent’s level of racial tolerance. 
Schelling hoped to demonstrate the powerful influence of this simple micromotive 
on the resulting racial structure of the neighbourhood inhabited by the agents. 
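
The rule set described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Schelling's original procedure: the grid size, the share of empty cells, the reading of a ‘nearby location’ as the closest acceptable vacancy, and all function names (`make_grid`, `same_type_neighbours`, `step`, `run`) are choices made here for the sake of the example.

```python
import random

EMPTY = 0  # cell marker; agent types are 1 and 2


def make_grid(size, empty_frac, rng):
    """Randomly place two equal-sized groups on a size x size grid."""
    n_cells = size * size
    n_agents = n_cells - int(n_cells * empty_frac)
    cells = [1] * (n_agents // 2) + [2] * (n_agents - n_agents // 2)
    cells += [EMPTY] * (n_cells - n_agents)
    rng.shuffle(cells)
    return [cells[r * size:(r + 1) * size] for r in range(size)]


def same_type_neighbours(grid, r, c, kind):
    """Count agents of type `kind` in the (up to) eight cells around (r, c)."""
    size = len(grid)
    count = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr, dc) != (0, 0) and 0 <= rr < size and 0 <= cc < size:
                count += grid[rr][cc] == kind
    return count


def step(grid, threshold, rng):
    """Give each agent one chance to move; return how many agents moved."""
    size = len(grid)
    agents = [(r, c) for r in range(size) for c in range(size)
              if grid[r][c] != EMPTY]
    rng.shuffle(agents)
    moved = 0
    for r, c in agents:
        kind = grid[r][c]
        if same_type_neighbours(grid, r, c, kind) >= threshold:
            continue  # tolerance requirement met, so the agent stays put
        grid[r][c] = EMPTY  # lift the agent so it does not count itself
        options = [(i, j) for i in range(size) for j in range(size)
                   if grid[i][j] == EMPTY and (i, j) != (r, c)
                   and same_type_neighbours(grid, i, j, kind) >= threshold]
        if options:
            # assumed reading of "nearby": the closest acceptable vacancy
            i, j = min(options,
                       key=lambda p: max(abs(p[0] - r), abs(p[1] - c)))
            grid[i][j] = kind
            moved += 1
        else:
            grid[r][c] = kind  # nowhere acceptable, so the agent stays
    return moved


def run(size=20, empty_frac=0.2, threshold=3, max_steps=100, seed=0):
    """Iterate until no agent moves or max_steps is reached."""
    rng = random.Random(seed)
    grid = make_grid(size, empty_frac, rng)
    for _ in range(max_steps):
        if step(grid, threshold, rng) == 0:
            break
    return grid


if __name__ == "__main__":
    for row in run():
        print("".join(".XO"[v] for v in row))
```

Run as a script, the sketch prints the final grid as text, with `X` and `O` for the two agent types and `.` for vacant cells, which allows the kind of visual inspection of the settlement pattern that the chequerboard itself afforded.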


7.3.2 Results of the Model: Looking to the Individual 


The results of Schelling’s model were surprising to social scientists at the time. 
The model showed that, for a wide range of tolerance thresholds, these initially 
integrated neighbourhoods of tolerant agents would ‘tip’ toward one group or 
another, leading the outnumbered group to leave and resulting in segregation 
(Schelling 1971). Given that residential segregation was widespread in American 
cities prior to the civil rights movement, even the rising tolerance following the 
additional rights granted to African-Americans was not sufficient to provoke a 
significant decrease in residential segregation, even by 1990 (Sander 1998). 

Schelling’s model demonstrates that a mere improvement in individual preferences 
and a lack of housing discrimination (as outlawed by the Civil Rights Act of 
1968) are not sufficient to eliminate residential segregation. In fact, largely tolerant 
‘microbehaviours’ can still result in such problems, as long as social factors such as 
race remain a consideration in housing choice. 


7.3.3 Problems of the Model: A Lack of Social Structure 


Since Schelling’s original model formulation and the resultant interest of the 
social science community, many researchers have attempted to update the 
model with more sophisticated computational techniques and a greater number of 
influential social factors within the model. While Schelling’s model was accepted 
as a remarkable illustration of a simple principle regarding the large-scale effects of 
individual housing preferences, some modellers sought to create a more ‘realistic’ 
Schelling-type model which could incorporate socio-economic factors as well 
(Sander et al. 2000; Clark 1991; Epstein and Axtell 1996). Given the accepted 
complexity of residential segregation as a social problem, and the new insight into 
the effects of individual preference illuminated by Schelling, models incorporat- 
ing Schelling-style ‘micromotives’ and large-scale social factors were seen as a 
potential method for examining the interactions between these two levels of social 
structure, something that was very much lacking in Schelling’s original formulation. 


7.4 Emergence by Any Other Name: Micromotives 
and Macrobehaviour 


7.4.1 Schelling’s Justifications: A Valid View of Social 
Behaviour? 


Interestingly, despite the general simplicity of Schelling’s model and the lack of a 
larger social context for the agents in the model, his discussion of micromotives 
quickly gathered momentum among social scientists. His contention that certain 
non-obvious conclusions regarding social behaviour may follow from studies that 
do not depend upon empirical observation was influential, leading other modellers 
to seek patterns of interaction in generalised social systems [other tipping models 
here]. 

In addition, Schelling’s description of critical thresholds that lead to these ‘tipping’ 
phenomena led to an influx of sociological models exploring this possibility 
in relation to numerous large-scale social phenomena. Within political science, 
Laitin used tipping models to examine the critical points at which populations 
choose one language over another, as in the case of Ghana (Laitin 1994), and in the 
growing acceptance of the Kazakh native tongue over Russian in Kazakhstan (Laitin 
1998). In general, then, Schelling’s justifications for his residential segregation 
model have been widely adapted throughout the social sciences; this simple method 
for examining critical points in societal interaction seems to have generated a great 
deal of related research in the years following his book’s publication. 

Perhaps more importantly for the larger social science community, Schelling’s 
model also sparked additional interest in empirical studies over the years as social 
scientists wished to confirm his claims. The most prominent example of this 
model feeding back new ideas into the empirical domain was W.A.V. Clark’s 
study (Clark 1991). Clark used the most recent demographic surveys available at 
the time to examine elements of local racial preference in residential segregation 
in real communities; while the character of the results did differ from Schelling, 
the influence of local racial preferences was strong, confirming Schelling’s most 
important claim. This sort of empirical validation only lends further credence to 
Schelling’s methodology and its success. 


7.4.2 Limiting the Domain: The Acceptance 
of Schelling’s Result 


While Schelling’s model did not incorporate any semblance of large-scale social 
structure in its simple grid-world, this lack of detail may have contributed to the 
general acceptance of his model amongst the social-science community. While 
his work did provide a significant influence on later modelling work in the field 
and some empirical work, as noted above, his initial forays into the residential 
segregation problem were very limited in scope (Schelling 1971). 

Schelling aimed to illuminate the role of individual behaviours in producing 
counter-intuitive results in large-scale social systems, and his simple residential 
tipping model produced just such a counter-intuitive result by demonstrating the 
inability of individual tolerance for racial mixing to eliminate segregation. In this 
way the initial chequerboard model provided a theoretical backdrop for the larger 
thesis of his later book-length examination Micro-Motives and Macro-behaviour 
(Schelling 1978). Rather than producing one large, complex model which illustrated 
the importance of these individual preferences in the behaviours of social systems, 
he produced a number of small-scale, simple models which illustrated the same 
point in a more easily digestible and analysable way. Perhaps then the lack of 
complexity on display was what made his models so influential; providing such 
a basic backdrop made replication and expansion of his result straightforward for 
the research community. 


7.4.3 Taylor’s Sites of Sociality: One View of the Acceptance 
of Models 


Computational modelling, as described earlier, is inherently a theory-dependent 
exercise. Modellers seek to simplify reality in order to examine specific behaviours 
within the system of interest, and those simplifications and abstractions often betray 
a theoretical bias. In addition to this difficulty, Taylor describes the concept of 
‘sites of sociality’ within modelling communities (Taylor 2000). In Taylor’s view, 
these sites correspond to points at which social considerations within a scientific 
discipline begin to define parameters of a model, rather than considerations brought 
about by the subject matter of the model itself. 

Thus, if a certain evolutionary theory has become dominant within a specific 
domain of population biology, for example, a model which ignores the conceits of 
that theory may not find acceptance among the community. Schelling’s modelling 
work served to add to current social theory in evidence at that time, but did not 
seek to overturn the dominant ideas of the field; perhaps, in Taylor’s view, this 
was a powerful method for gaining widespread acceptance in the social science 
community. 


7.4.4 The Significance of Taylor: Communicability and Impact 


Taylor’s description of sites of sociality within modelling communities brings 
an important point to bear. Even if a model is constructed in such a way that 
the modeller can justify its relevance to the broader empirical community, that 
community may be operating under a different understanding of what is important 
to display in a simulation, or the importance of simulations as a whole. 

Once again, we may use Vickerstaff and Di Paolo’s model as an example 
(Vickerstaff and Di Paolo 2005). Their model was accepted and communicated in 
a very popular experimental biology journal, despite being entirely computational 
in nature. The editors of the journal still accepted the paper despite its lack of hard 
empirical data; the model’s relative simplicity kept it from being too alien a concept 
for the biological readership, and the data it presented was novel and relevant despite 
its origins. The ability to comprehend the model in a more substantive way is vital 
to its acceptance; if the model had been too complex and difficult to replicate, the 
results would have been less impactful and interesting for the target audience. 

Also like Schelling, the nature of the model makes it very palatable to the 
experimental biology readership in a different manner: the model was adding to 
the discourse on a particular topic in biology, rather than attempting to make major 
alterations to the field. If an Alife researcher submitted a paper to an experimental 
journal that proclaimed an end to conventional insect biology as we know it, for 
example, then the editors are unlikely to be receptive. As with Schelling, this paper 
served to illuminate a new idea regarding an issue relevant to the community it 
addressed; the paper did not ignore the pre-existing conceits of the community, and 
the data it produced was of value to the theories established in that community by 
decades of difficult data collection and analysis. 

Thus, Schelling and Vickerstaff and Di Paolo both show the importance of 
communicability in a model which seeks to engage the wider research community. 
Schelling had great impact due to the ease of communicating his model results, 
and the ease with which members of the social science community could repli- 
cate and engage with those results; similarly, Vickerstaff and Di Paolo achieved 
success by crafting a model which demonstrated relevant new ideas that could be 
communicated well to the biological community despite the differing methods of 
data collection. Moving forward, we will see how the simplicity of a model can not 
only assist communicability and impact for a given model, but also its tractability 
and ability to produce useful, analysable results. 


7.5 Fitting Schelling to the Modelling Frameworks 


7.5.1 Schelling and Silverman-Bullock: Backstory 


Under Silverman and Bullock’s framework (Silverman and Bullock 2004), 
Schelling’s model succeeds due to the integration of a useful theoretical ‘backstory’ 
for the model. Schelling represents the chequerboard model as an example of 
one particular micromotive leading to one particular macrobehaviour, in this case 
residential segregation. In the context of this backstory, Schelling is able to present 
his model as a suitable example of the impact of these micromotives on one 
particularly thorny social problem. 

By restricting his inquiry to this singular dimension, the model becomes more 
palatable to social scientists, as the significant abstractions made within the model 
facilitate the portrayal of this aspect of the phenomenon in question. This serves to 
emphasize Schelling’s original theoretical formulation by stripping away additional 
social factors from the model, rather than allowing multiple interacting social 
behaviours to dilute the evidence of a much greater role for individual preference in 
residential segregation. 


7.5.2 Schelling and Levins-Silverman: Tractability 


Under our clarified Levinsian framework from Chap. 4 (see Table 4.1), in which 
tractability forms the fourth critical dimension of a model alongside generality, 
realism and precision, Schelling’s model appears to fall into the L2 categorisation. 
The model sacrifices realism for generality and precision, producing an idealised 
system which attempts to illustrate the phenomenon of residential segregation in a 
broader context. 

Meanwhile, tractability remains high due to the simplistic rules of the simulation 
as well as the general ease of analysis of the model’s results; for many social 
scientists, a simple visual examination of the resultant segregated neighbourhoods 
from a run of the simulation proves Schelling’s point fairly clearly. More in- 
depth examinations are also possible, as seen in Zhang’s assessment of the overall 
space of Schelling-type models (Zhang 2004); such assessments are rarely possible 
with more complex models, which involve much more numerous and complex 
parameters. 


7.5.3 Schelling and Cederman: Avoiding Complexity 


In Cederman’s framework (Cederman 2001, see Table 5.1), Schelling’s model is a 
C1 simulation, as it attempts to explain the emergence of a particular configuration 
by examining the traits of agents within the simulation. In this way the model 
avoids the thorny methodological difficulties inherent to C3 models, as well as 
the more complex (and hence potentially more intractable) agent properties of C2 
models. While C3 models are perhaps more useful to the social science modeller, 
due to the possibility of producing agents that determine their own interactions 
within the simulation environment, constraints such as those imposed by Schelling 
help to maintain tractability. This tractability does however come at the expense of 
increased self-organisation within the model. 


7.6 Lessons from Schelling 


7.6.1 Frameworks: Varying in Usefulness 


Having tied Schelling’s work into each of our previously-discussed modelling 
frameworks, a certain trend becomes apparent in the placement of Schelling’s model 
within each theoretical construction. The simplicity of Schelling’s model places it 
toward the extreme ends of each framework: it has an easily-defined theoretical 
backstory; lies well within the range of tractability under Silverman and Bullock; 
and falls firmly within the C1 category described by Cederman. 

While Cederman’s categorisation may help us to understand the aims and goals 
of a Schelling-type model, such ideas are already apparent due to the theoretical 
backstory underlying the model (which in turn places the model in good stead 
according to our modified Levinsian framework). The pragmatic considerations of 
the model itself, as in whether it is amenable to analysis, are more important in 
driving our declaration of Schelling as a useful model. After all, a very ambitious 
and completely incomprehensible model could quite easily fall into Cederman’s 
C3 category; however, its impenetrable nature would be exposed to much greater 
criticism under the more pragmatic views of Levins and our revision of Levins 
presented in Chap. 4. 


7.6.2 Tractability: A Useful Constraint 


As described earlier, Schelling’s model benefits in several ways from its notably 
simple construction. Referring back to our revised Levinsian framework, the general 
concern of tractability is mollified by the model’s inherent simplicity. Given that the 
model can produce a visual demonstration of a segregation phenomenon, the job 
of the analyst is made much easier; the qualitative resemblance of that result to an 
overview of a segregated, real-world neighbourhood already lends credence to the 
results implied by Schelling’s model. 

Perhaps more interestingly, the abstract nature of the model also makes further 
analysis less enticing for those seeking harder statistics. While the model does 
represent agents moving in space in reaction to stimuli, they do so as abstract units 
during time and space intervals that bear no set relation to real-world time and space. 
Fundamentally, the model seeks only to demonstrate the effects of these agents’ 
micromotives, and the effects of those micromotives on the question of interest; in 
that sense an in-depth analysis of the speed of segregation through the model’s space 
and similar measures, while interesting, is not necessary for Schelling to illustrate 
the importance of individual motives in residential segregation. As will become 
evident in the following section, Schelling’s ideas regarding modelling methodology 
drove him to construct his model in this way to maintain both tractability and 
transparency. 


7.6.3 Backstory: Providing a Basis 


Silverman and Bullock’s concept of the importance of a theoretical ‘backstory’ for 
any modelling enterprise (Silverman and Bullock 2004) seems supported by the 
success of Schelling’s work. His approach to modelling social micromotives derived 
from a theoretical backstory which takes in several important points: 


1) Within a given research discipline, there are non-obvious analytical truths which may 
be discovered by means which do not include standard empirical observation (specifically 
mathematical or computational modelling in this case). 


In the case of Schelling’s residential segregation model, the non-obvious result 
of the interaction of his tolerant agents is that even high tolerance levels still lead to 
segregation; one should note in this case that Schelling’s result was not only non- 
obvious in the context of the model, but was also non-obvious to those who studied 
the segregation phenomenon empirically. 


2) The search for general models of phenomena can lead to the important discovery of 
general patterns of behaviour which then become evident in examples across disciplines. 


Schelling uses ‘critical mass’ models as an example (tipping models being a sub- 
class of these), arguing that they have proven to be useful in ‘epidemiology, fashion, 
survival and extinction of species, language systems, racial integration, jay-walking, 
panic behaviour, and political movements’ (Schelling 1978, p. 89). The explosion of 
interest in tipping models following Schelling’s success with residential segregation 
indicates that such inter-disciplinary usefulness may indeed be a crucial element to 
the success of his model. 


3) Modellers should seek to demonstrate phenomena in a simple, transparent way. As he 
states, a model ‘can be a precise and economical statement of a set of relationships that are 
sufficient to produce the phenomena in question, or, a model can be an actual biological, 
mechanical, or social system that embodies the relationships in an especially transparent 
way, producing the phenomena as an obvious consequence of these relationships’ (Schelling 
1978, p. 87). 


Schelling’s own models conform to this ideal, utilising very simple agents 
governed by very simple rules to illustrate the importance of individual behaviours 
in social systems. Zhang’s analysis of Schelling-type models using recent advances 
in statistical mechanics shows one of the benefits of this transparency in a modelling 
project (Zhang 2004). 


7.6.4 Artificiality: When it Matters 


The importance of artificiality within a simulation methodology as espoused by 
Silverman and Bullock (2004) is especially crucial to an evaluation of Schelling- 
type models. Schelling himself posits that models can illuminate important patterns 
in a system of interest without requiring recourse to empirical observation (Schelling 
1978); in this fashion his work suggests Silverman and Bullock’s Artificial1 and 
Artificial2 distinction as a sensible path for models to take. However, he further 
clarifies this idea by proposing that such models must remain transparent and easily 
analysable, displaying a clear interaction which leads to the appropriate results in 
the system of interest; an overly-complex Artificial1 model, in his view, cannot 
provide that clear link between the forces at play within the model and the resultant 
interesting behaviour. 

Further, Schelling’s two-part definition of models goes on to describe the 
potential for using an actual biological or social system in a limited context to 
illustrate similar points (Schelling 1978); this statement implies that Artificial2 
models, which would be a man-made instance of these natural behaviours, may be 
able to provide that simplicity and transparency that empirical observation of such 
complex systems cannot provide. In one sense, then, Schelling appears to dismiss 
the question of artificiality in preference to the modeller’s motivations: in order to 
display the effect of micromotives or emergent behaviour, the model must display 
a clear relationship between the resultant behaviour and the contributing factors 
alleged to create that behaviour, and whether that model then embodies Artificial1 
or Artificial2 is not necessarily of any consequence. 


7.6.5 The Practical Advantages of Simplicity 


Schelling also demonstrates the practical usefulness of creating a simple model of 
a phenomenon. So far we have seen how this simplicity allows us to avoid some of 
the methodological pitfalls that can trouble those who choose to utilise agent-based 
models, and likewise it is easy to demonstrate how this same property can help the 
modeller in more pragmatic ways. 

Firstly, such simplicity not only allows for higher tractability, but also much 
simpler implementation. In the case of Schelling’s model, numerous implementa- 
tions of the model exist in a large variety of programming languages. Writing the 
code for such a simulation is almost trivial compared to more complex simulations; 
indeed, some pre-packaged simulation codebases such as RePast (designed for the 
social scientist) can be utilised to produce a Schelling-type model in only a few 
lines. Beyond simply the time savings of this ease of implementation, the simple 
construction of Schelling’s model vastly reduces the amount of time spent tweaking 
a simulation’s parameters. In more complex simulations, such as an evolutionary 
model, parameters such as mutation rates can have unexpected impacts on the 
simulation results, and finding appropriate values for those parameters can be time- 
consuming. 

Secondly, starting from such a simple base allows for much greater extendability. 
With the Schelling model being so easily implemented in code, extending some 
elements of that model becomes very easy for the researcher. For example, 
alterations in the number of agent types, the complexity of the agents themselves, or 
the set of rules governing the agents are easy to create. In addition, the simple nature 
of the initial model means it is also easy to change one aspect of the model and see 
clearly the results of that change; in a more complex formulation, changing one 
element of the simulation may produce unexpected changes, and complex elements 
in other areas of the simulation could be affected in ways that produce unanticipated 
results. 
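
To make the point about extendability concrete, the following sketch separates the agents' decision rule from the rest of the model, so that the rule or the number of agent types can be changed in isolation. It is an illustration under assumptions, not a description of any published implementation: the function names (`neighbour_values`, `count_rule`, `share_rule`, `discontented`) are hypothetical, and `share_rule` is an invented variant included only to show a swapped-in rule.

```python
EMPTY = 0  # vacant cell; any other integer is an agent type


def neighbour_values(grid, r, c):
    """Contents of the (up to) eight cells surrounding (r, c)."""
    n_rows, n_cols = len(grid), len(grid[0])
    return [grid[i][j]
            for i in range(max(r - 1, 0), min(r + 2, n_rows))
            for j in range(max(c - 1, 0), min(c + 2, n_cols))
            if (i, j) != (r, c)]


def count_rule(threshold):
    """Content if at least `threshold` neighbours share the agent's type."""
    def rule(kind, nbrs):
        return nbrs.count(kind) >= threshold
    return rule


def share_rule(min_share):
    """Hypothetical variant: content if the same-type share of occupied
    neighbouring cells is at least `min_share` (isolated agents are content)."""
    def rule(kind, nbrs):
        occupied = [v for v in nbrs if v != EMPTY]
        return not occupied or occupied.count(kind) / len(occupied) >= min_share
    return rule


def discontented(grid, rule):
    """Positions of all agents whose rule is not met, in row-major order."""
    return [(r, c)
            for r, row in enumerate(grid)
            for c, kind in enumerate(row)
            if kind != EMPTY and not rule(kind, neighbour_values(grid, r, c))]


# Three agent types instead of two: nothing in the code above has to change.
grid = [
    [1, 1, 2],
    [3, 0, 2],
    [3, 3, 2],
]

if __name__ == "__main__":
    print(discontented(grid, count_rule(2)))   # [(0, 0), (0, 1), (0, 2), (2, 2)]
    print(discontented(grid, share_rule(0.5)))  # [(0, 1)]
```

Because each rule is an ordinary function of an agent's type and its neighbours, swapping one rule for another changes exactly one element of the model, and any difference in the resulting list of discontented agents can be attributed to that change alone.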

Finally, the modeller benefits from potentially a much larger impact of the 
simulation when it is simple to implement. For example, we saw previously how 
Zhang was able to probe the properties of the entire space of Schelling-type models 
(Zhang 2004). If Schelling’s model were too complex, this would be an impossible 
task. Instead, due to its simplicity, dozens of interested researchers could implement 
Schelling’s model for themselves with little effort, see the results for themselves, 
and then modify the model or introduce new ideas almost immediately. Such ease 
of replication and modification almost certainly helped Schelling to reach such a 
high level of impact from his initial model; the simplicity of the model essentially 
lowers the barrier of entry for interested parties to join the discussion. 


7.7 Schelling vs Doran and Axelrod 


7.7.1 Doran’s View: Supportive of Schelling? 


Doran’s view of agent-based models in social science (Doran 2000) proposes 
that such models can provide a means to generate new ‘world histories,’ or 
artificial explorations of versions of human society that may have come into being 
given different circumstances. While Schelling’s model could be construed as an 
‘artificial society’ of the simplest order, the model is oriented less toward reaching 
grand conclusions about the structure of society as a whole and more toward a 
demonstration of the low-level factors in a society which may produce one particular 
phenomenon. 

With this in mind, Doran’s further concerns about the undefined role of agents 
in social simulation also seem of particular import. Doran argues that agents in 
social simulation should be defined on a computational basis to allow those agents 
to develop emergent behaviour in the same way as other aspects of the simulation. 
Schelling’s model incorporates agents in a most abstract way; each individual in the 
model makes only a single decision at any given time-step, based on only a single 
rule. Given this simplicity, could we argue that these agents are sufficiently advanced 
to bring us a greater understanding of the residential segregation problem? If not, 
how might Doran’s view inform a more rigorous version of Schelling’s original 
vision? 

While Schelling’s model is indeed oriented toward a specific social problem in 
an intensely simplified form, in a sense he is providing a minimal ‘artificial society’ 
as Doran describes this methodology. Schelling is able to create alternate ‘world 
histories’ for the limited two-dimensional space inhabited by his simple agents; he 
can quite easily run and re-run his simulation with different tolerance values for 
the agents, and examine the resulting patterns of settlement following each change. 
For example, he could determine the result of populating a world with completely 
tolerant agents, completely intolerant agents, and all variations in between. 
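
Such re-running can be sketched as a simple parameter sweep. The self-contained example below starts every run from the same random world and varies only the tolerance threshold, from 0 (completely tolerant) to 8 (completely intolerant). The similarity measure, the grid parameters, the nearest-vacancy movement rule, and all function names are assumptions made for this illustration rather than details taken from Schelling's own account.

```python
import random

EMPTY = 0  # vacant cell; agent types are 1 and 2


def neighbours(grid, r, c):
    """Contents of the (up to) eight cells surrounding (r, c)."""
    n = len(grid)
    return [grid[i][j]
            for i in range(max(r - 1, 0), min(r + 2, n))
            for j in range(max(c - 1, 0), min(c + 2, n))
            if (i, j) != (r, c)]


def similarity(grid):
    """Mean share of same-type agents among each agent's occupied neighbours."""
    shares = []
    for r, row in enumerate(grid):
        for c, kind in enumerate(row):
            occupied = [v for v in neighbours(grid, r, c) if v != EMPTY]
            if kind != EMPTY and occupied:
                shares.append(occupied.count(kind) / len(occupied))
    return sum(shares) / len(shares) if shares else 0.0


def random_grid(size, empty_frac, rng):
    """Randomly place two equal-sized groups on a size x size grid."""
    n_cells = size * size
    n_agents = n_cells - int(n_cells * empty_frac)
    cells = [1] * (n_agents // 2) + [2] * (n_agents - n_agents // 2)
    cells += [EMPTY] * (n_cells - n_agents)
    rng.shuffle(cells)
    return [cells[r * size:(r + 1) * size] for r in range(size)]


def settle(grid, threshold, rng, max_steps):
    """Move discontented agents to the nearest acceptable vacancy until
    nobody moves or max_steps is reached. The grid is modified in place."""
    n = len(grid)
    for _ in range(max_steps):
        moved = 0
        agents = [(r, c) for r in range(n) for c in range(n)
                  if grid[r][c] != EMPTY]
        rng.shuffle(agents)
        for r, c in agents:
            kind = grid[r][c]
            if neighbours(grid, r, c).count(kind) >= threshold:
                continue
            grid[r][c] = EMPTY  # lift the agent so it does not count itself
            spots = [(i, j) for i in range(n) for j in range(n)
                     if grid[i][j] == EMPTY and (i, j) != (r, c)
                     and neighbours(grid, i, j).count(kind) >= threshold]
            if spots:
                i, j = min(spots,
                           key=lambda p: max(abs(p[0] - r), abs(p[1] - c)))
                grid[i][j] = kind
                moved += 1
            else:
                grid[r][c] = kind  # nowhere acceptable, so the agent stays
        if moved == 0:
            break


def sweep(size=10, empty_frac=0.25, seed=0, max_steps=30):
    """Map each threshold 0..8 to (similarity before, similarity after)."""
    results = {}
    for threshold in range(9):
        rng = random.Random(seed)  # same starting world for every threshold
        grid = random_grid(size, empty_frac, rng)
        before = similarity(grid)
        settle(grid, threshold, rng, max_steps)
        results[threshold] = (before, similarity(grid))
    return results


if __name__ == "__main__":
    for threshold, (before, after) in sweep().items():
        print(f"threshold {threshold}: {before:.2f} -> {after:.2f}")
```

With a threshold of 0 no agent ever has reason to move, so the settlement pattern is left exactly as it started; comparing the other rows against that baseline shows how much of any resulting clustering is attributable to the tolerance value alone.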

With regard to Doran’s concerns regarding the roles of agents in social simula- 
tion, Schelling’s model suffers more under scrutiny. Despite the simplicity of the 
model itself, the agents are built to a pre-defined role: each agent is assumed to be 
searching for a place to settle, with its preference for a destination hinging upon 
the appearance of its neighbours. This presumes that individuals in a society would 
prefer to remain settled, rather than move continuously, and that all individuals will 
display a primary preference based upon the characteristics of their neighbours; both 
of these assumptions have been placed into the model by Schelling’s own conceptual 
model, rather than having those agent properties emerge from the simulation itself. 

One could imagine constructing a Luhmannian scenario in which agents are 
given only the most basic properties: perhaps a means of communication, an ability 
to form preferences based on interactions, and the ability to react to their neighbours. 
Might these agents, based upon interactions with neighbours of different charac- 
teristics, form preferences independent of the experimenter’s expectations? If so, 
then these preferences would emerge from the simulation along with the resultant 
agent configurations in the simulated world, making the agents more independent 
of the experimenter’s biases, though of course never entirely independent of bias. 
Such a set-up would certainly alleviate Doran’s concerns about pre-defined agents 
and theoretical biases, but whether the results would be more or less useful to the 
community than Schelling’s original is still debatable. 


7.7.2 Schelling vs Axelrod: Heuristic Modelling 


Robert Axelrod and Leigh Tesfatsion’s introduction to agent-based modelling 
for social scientists, discussed in the previous chapter, describes four main goals 
of social simulation models: empirical understanding, normative understanding, 
heuristics, and methodological advancement (Axelrod and Tesfatsion 2005). 
Schelling’s model seems to fall most readily into the heuristic category, seeking 
as it does a fundamental insight into the mechanisms underlying the phenomenon 
of residential segregation. 

Axelrod’s view, unlike Doran’s, stops short of examining specific methodological 
difficulties in social simulation modelling. Instead, he develops these four general 
classifications of model types, placing agent-based modelling into a framework 
oriented more toward empirical study. Given that three of Axelrod’s four categories 
are directly concerned with empirical uses of agent-based models, this framework 
offers little guidance as to the appropriate use of models like Schelling’s. 

Of course, as indicated by the discussion of Axelrod’s categorisations in the 
previous chapter, our analysis thus far has indicated a number of theoretical 
difficulties with this empirical approach to social simulation. With this in mind, the 
fact that Schelling lies outside the main focus of Axelrod’s approach is not 
particularly surprising. Like the approaches of Doran, Luhmann, and Silverman and Bryden 
(Silverman and Bryden 2007), Schelling’s model is more appropriate in the context 
of a more general and abstracted modelling perspective, one which seeks general 
understanding of social phenomena rather than data relating to specific aspects of 
society. 


7.8 Schelling and Social Simulation: General Criticisms 


7.8.1 Lack of ‘Real’ Data 


While Schelling thus far has held up well under the scrutiny of several major 
theoretical frameworks regarding simulation models, there are still concerns levelled 
generally at social simulation which must be addressed. The first, and potentially 
the most troublesome, is the lack of ‘real’ data attributed to models within social 
science, and the resultant disconnection between these simulations and empirical 
social science. 

Schelling quite clearly falls afoul of this methodological sticking point. There 
is no aspect of the chequerboard model which is based upon empirical data: 
the chequerboard itself is designed arbitrarily, with no real-world context or 
comparison; the agents are given a tolerance threshold by the experimenter, not one 
based upon any empirical study; and there is no representation of the nuances of 
human residential areas, such as buildings, other residents, or other interacting social 
factors that may affect the residential segregation phenomenon. In other words, 
Schelling’s model is very much an abstraction with no real basis in empirical social 
science. 

However, given the context of Schelling’s work, the abstraction is entirely 
justifiable. Had Schelling proposed to understand residential segregation in one 
particular circumstance, and then produced such an abstract model, he would 
have been reaching for conclusions far beyond the scope of the model itself. His 
question instead was much broader: can we illustrate the effect of individual 
‘micromotives’ on an otherwise difficult-to-explain social phenomenon? His model 
answers this question without the need for specific ties to empirical data-sets, and 
indeed ‘real’ data would most likely dilute the strength of his result in this context. 


7.8.2 Difficulties of Social Theory 


Revisiting Klüver and Stoica once more, we recall their assertion that social theory 
does not benefit from being easily divisible into interacting hierarchical structures as 
in other fields, such as biology (Klüver et al. 2003). Given that a social system will 
encompass potentially millions of individuals, each interacting on multiple levels, 
while higher-level social structures impose differing varieties of social order, the 
end behaviour of a society through all of these factors can be exceedingly complex. 
This inherent difficulty in social science makes the prospect of empirically-based 
simulation seem ever more distant. 

However, Schelling once more illuminates the benefits of a more abstract 
approach to examining social systems. Schelling’s model suffers little from the 
problem of interacting social structures, as the model itself involves only a set of 
agents interacting in a single space: there are no social structures; no imposed social 
order in the system; and only a single type of interaction between agents. The model 
thus escapes this difficulty by quite simply eliminating any social structures; without 
these multiple interacting structures to confound analysis, the model’s result remains 
clear. 


7.8.3 Schelling and the Luhmannian Perspective 


With the abstraction and simplicity of Schelling’s model allowing it to escape 
from the methodological traps common to most social simulation endeavours, 
we are left with an interesting perspective on our earlier proposed method for 
developing fundamental social theory. Similar to Schelling’s tipping model, the 
Luhmannian modelling perspective would construct an agent-based model bearing 
as few assumptions as possible, allowing the resultant configurations of agents and 
model properties to emerge of their own accord. 

In essence, Schelling’s model takes the Luhmannian approach and boils it down 
to an exploration of a single aspect of human society. While Luhmann asks what 
drives humanity to develop social order (Luhmann 1995), Schelling restricts his 
domain to only residential segregation, asking whether individual motives can drive 
agents to separate themselves from one another. While Luhmann condenses the 
whole of human interaction down to social developments linked to an iterated 
process based on our ‘expectation-expectations,’ Schelling similarly condenses the 
puzzling behaviour of human segregation down to a series of simple individual 
decision-making steps. 

Schelling’s success, then, gives the Luhmannian approach a further emphasis. 
While the investigation of the overall origins and development of the human social 
order is certainly a much more complex endeavour than that of investigating a 
single social phenomenon à la Schelling, this residential segregation model provides 
an insight into the benefits of the process. Schelling wrote of the importance of 
demonstrating the relationships between model properties transparently (Schelling 
1978), and with a Luhmannian model the same approach is necessary. 


7.8.4 Ramifications for Social Simulation and Theory 


Having established the importance of Schelling’s perspective, and the links between 
his modelling paradigm and our proposed means for developing fundamental social 
theory, a further re-evaluation of Schelling’s impact on social simulation is required. 
We have seen the import of this model’s simplicity and transparency, and even how 
its inherent abstraction enables the model to draw intriguing conclusions within the 
larger context of general social theory. Does this imply that the empirically-based 
approach of Cederman, Axelrod and others is a scientific dead-end? 

Perhaps not: certainly as the field of geographic information systems (GIS) 
continues to advance, and real-time data collection within social science becomes 
more wide-spread and rigorous, then the introduction of real and current data 
into social simulation becomes an interesting possibility. This after all is a central 
criticism of social simulation of this type: real data is hard to come by, and that 
data which is available is often limited in scope or out of date. When this obstacle 
disappears, then simulations more closely linked to data produced by real human 
society become more viable. 

However, the other fundamental concerns related to social simulation still 
remain. Doran’s concerns regarding the lack of definition of the roles of agents 
in social simulation (Doran 2000) remain important, and as integration with real 
data becomes vital to a simulation that concern becomes ever more central. The 
question of how to develop cognitive agents with an appropriate set of constraints 
and assumptions to produce parsimonious results is not one that will be answered 
quickly. 

Similarly, the inherent abstraction of social processes and structures necessary 
in a computer simulation could be problematic in a simulation designed to produce 
empirically-relevant results. Axelrod and Tesfatsion (2005) propose social 
simulations that may inform public policy (their ‘normative understanding’ category of 
social simulations), and while this is an enticing prospect, we have already seen the 
troubling pitfalls the researcher can encounter in this approach (see Chaps. 4 and 5). 

This reinforces the importance of Schelling’s model as deconstructed in the 
preceding analysis. While initially appearing simplistic, the theoretical implications 
of Schelling’s work are rather more complex. The importance of this model in social 
science despite its simplicity, complete abstraction, and lack of empirical data shows 
the potential of social simulation to stimulate social theory-building. Schelling’s 
model stimulated the field to view the importance of individual ‘micromotives’ in a 
new fashion; a more sweeping model portraying the emergence of a social order in 
a similar way could have an equally stimulating effect on social theory. 


7.9 The Final Hurdle: Social Explanation 


7.9.1 Schelling’s Model and the Explanation Problem 


As described in Chap. 5, there is some debate over the explanatory power of com- 
puter simulation within the social sciences. While the predominant idea within such 
endeavours is that of emergence, or the development of higher-level organisation in 
a system given interacting lower-level components, some theorists argue that social 
systems do not produce higher-level organisation in this way (Sawyer 2002, 2003, 
2004). 

Sawyer’s development of the idea of non-reductive individualism, in which 
certain properties of a society cannot be reduced to the actions of individual actors 
in that society, does pose a fundamental problem for agent-based modellers. If 
agent-based models proceed on the assumption that individual-level interactions 
can produce the higher-level functions of a social order purely through those inter- 
actions, then such models may be missing a vital piece of the explanatory puzzle 
within social science. In this respect individual-based models need a theoretical 
underpinning in which emergence is a valid means for the development of high- 
level social structures. 

Schelling’s model focuses entirely on the actions of individual agents, and the 
impact of their individual residential preferences on the racial make-up of a given 
neighbourhood. In this sense the model does seek to demonstrate the emergence of 
a higher-level configuration from the actions of individuals, which according to this 
view of social explanation is problematic. 

However, Macy and Willer’s take on this social explanation problem (Macy and 
Willer 2002) would allow for Schelling-type models, as the model focuses purely on 
a situation for which there is no central coordination of the behaviour in question. 
The agents in Schelling’s chequerboard world do act individually, but the results he 
sought were not a higher-level social organisation or function, but instead merely 
a particular configuration of those individuals at the end of a simulation run. Even 
in Sawyer’s less forgiving view, Schelling’s simulation does not strive to explain a 
larger-scale structure that might be dubbed irreducible. 


7.9.2 Implications for Luhmannian Social Explanation 


Our earlier discussions of Niklas Luhmann’s ideas regarding the development of 
the human social order presented a means for applying these ideas to agent-based 
models designed to examine the fundamentals of society. By applying Luhmannian 
ideas of extremely basic social interactions that lead to the formation of a higher 
social structure, a modeller may be able to remove the theoretical assumptions often 
grafted into a model through the design of restrictive constraints on agents within 
that model. 

However, our examinations of the difficulty of social explanation remain prob- 
lematic for a Luhmannian approach. If certain high-level aspects of human society 
depend on functions which are irreducible to the properties of individuals within 
that society, then even the most cleverly-designed agent-based model would be 
unable to provide a complete explanation for the functioning of society. This places 
a fundamental limit on the explanatory power of the Luhmannian approach, barring 
us from explaining the social order in its entirety. 

Perhaps Sawyer’s comparisons with non-reductive materialism within the phi- 
losophy of mind may provide a solution (Sawyer 2004). As he notes, ‘the science 
of mind is autonomous from the science of neurons,’ alluding to the disconnect 
between psychology and neuroscience (Sawyer 2002). Indeed, within the study of 
mind there are conscious and unconscious phenomena which are irreducible in 
any straightforward way to the functioning of the underlying neurons; after all, 
psychologists still struggle to explain how neuronal firings lead to the subjective 
experience of individual consciousness. 

However, very few psychologists still adhere to the concept of dualism: the idea 
that conscious experience is a separate entity from the physical brain itself. Sawyer 
clearly does not, as is evident from his discussion of non-reductive materialism. In 
that case, while we may not be able to draw a direct relation between conscious 
phenomena and brain activity, the relation nonetheless exists; neurons are the cause 
of conscious phenomena, merely in a way we cannot understand straightforwardly. 

In the same sense, individuals will be the fundamental cause of large-scale social 
phenomena, whether those phenomena display clearly evident relationships or not. 
If we accept that a sufficiently advanced and appropriately constructed computer 
may be able to achieve consciousness, then surely a similarly advanced social 
simulation could allow sophisticated structures to emerge as well? In either case, the 
functioning of those individual units would be very difficult to relate to the emerging 
high-level phenomena, but developing artificial (and hence de-constructable) models 
of both these processes could provide a unique insight. 

Thus, Sawyer may very well be correct, and understanding the relation between 
high-level social structures or institutions and individuals in society may be difficult, 
or even impossible. But this is not to say that individuals, even in a model, could 
not produce such phenomena; nor that study of such models would not produce any 
insight. The Luhmannian modelling perspective aims to discover the roots of human 
society, and if those individual roots produce irreducibly complex behaviour, those 
models have surely functioned quite well indeed. 


7.10 Summary and Conclusions 


In this chapter we have examined one particular example of social simulation, 
Schelling’s chequerboard model, in light of the various theoretical concerns raised 
thus far. Schelling’s model was a marked success in the social sciences, producing 
an endless stream of related works as the idea of social phenomena emerging from 
the actions of individuals grew in the social sciences as it did in artificial life. 

The simplicity and transparency of the model allowed it to have a stronger impact 
than many more complex models. Along with being easily reproducible, Schelling’s 
results were restricted to a very specific question: can individual housing preferences 
drive the mysterious process of residential segregation? The answer, demonstrated 
by the starkly-separated patterns of white and black squares so prominent in the 
literature, appeared to be yes. 

This approach, while falling within the remit of the theoretical frameworks laid 
out by other social simulators, also provides an insight into new paths for producing 
social theory. While the use of social simulations for empirical study is enticing, and 
potentially both useful and lucrative, the methodological and theoretical difficulties 
involved in such an approach are many and complex. In contrast, a simple model 
with few inherent assumptions can offer a more basic and general description of the 
origin of various social phenomena. 

Schelling’s model also provides a view into the potential benefits of the proposed 
Luhmannian modelling approach for the social sciences. Schelling’s 
model took very basic interactions as a basis for producing more complex social 
behaviour; Luhmann takes a very similar approach, but on a much larger canvas. In 
this respect Schelling shows that the Luhmannian approach could allow social the- 
orists to develop ideas regarding the social order that may have large ramifications 
for social science as a whole. 

Of course, the problem of social explanation still looms as large for Luhmann as 
it does for Cederman or Axelrod. The debate over emergent phenomena in social 
science is unlikely to subside, and as such the results of such simulations may 
always be disputed on some level. However, even if we accept that some aspects of 
a Luhmannian simulation may not provide complete explanatory power, the results 
could still be revolutionary for social theory. 


Chapter 8 
Conclusions 


8.1 Overview 


The previous chapter focused on demonstrating our developed modelling 
frameworks in the context of one particular case study: Schelling’s well-known 
residential segregation model (Schelling 1978). Despite this model’s inherent 
simplicity, the results were seen as significant within social science. Our analysis 
of the methodological and theoretical underpinnings of Schelling’s model provided 
some insight into how his simple model of societal micromotives became so 
influential. 

However, Schelling need not be the only example of a successful computational 
modelling endeavour. While Schelling does fare well when viewed in the context 
of a number of different modelling frameworks, there are other examples of 
computational research which can provide useful results through varying means. 

By revisiting our central bird-migration example, and viewing each of our 
developed modelling frameworks from the first two sections of this text in the 
light of our analysis of Schelling, we can put together a more comprehensive set 
of ideas regarding the limitations of computational modelling. The effect of such 
ideas on substantive modelling work is also important to discuss; with these 
methodological frameworks in place, research utilising computational modelling 
will necessarily need to adapt to these restrictions. 


8.2 Lessons from Alife: Backstory and Empiricism 


8.2.1 Backstory in Alife 


Our analysis of Alife in Chap. 3 focused first on the distinction between two 
proposed varieties of artificiality: Artificial₁, a man-made example of something 
natural; and Artificial₂, something made to resemble something else. This 
distinction proved most important in the contrast between strong Alife and weak 
Alife: strong Alife seeks to create systems that are Artificial₁ in nature, while weak 
Alife seeks only Artificial₂ systems. 

Along with the drive to create digital examples of life, members of the Alife 
community have sought to use their simulations as a means for providing new 
empirical data points useful for studying life. Such an empirical goal is far from 
trivial, and requires a cohesive theoretical backstory to provide a basis for allowing 
that data to be used as such. As described in Sect. 3.6, our PSS Hypothesis for Life 
provides one example of such a perspective: if one accepts that life is an information 
ecology, then a suitably-programmed computer can also instantiate such a system. 

However, a backstory of this nature requires additional philosophical baggage. 
While a researcher may utilise this PSS Hypothesis to justify investigations into a 
digital living system, that system still exists only in a virtual form. The researcher 
becomes a digital ecologist, studying the output of the simulation for its own sake. 


8.2.2 Schelling’s Avoidance of the Issue 


Our analysis of Schelling provides a means for escaping these philosophical 
conundrums. Rather than proposing a model which is an Artificial₁ instantiation 
of human society in order to explore residential segregation, he produces a simple 
model which only illustrates an example of his concept of micromotives and their 
effect upon society (Schelling 1978). 

A version of our bird-migration model could take advantage of simplicity in a 
similar way. If we constructed a model which presented each bird as only a simple 
entity in a grid-world, with simple rules which drive the movements of those 
birds, we may be able to present an example of how singular micromotives could 
drive birds to shift from one location to another (say by moving according to food 
distribution or other factors that may contribute to our bird agents’ well-being under 
the given rule set). In such a formulation the question of Artificial₁ versus Artificial₂ 
is unimportant; the model is obviously Artificial₂ in nature, and no claim is made to 
be creating real, digital bird equivalents. The model simply seeks to show the impact 
of Schelling-esque micromotives in bird migration. 
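
A grid-world of this kind might be sketched as follows. This is a minimal illustration only, and every specific in it is an assumption of ours rather than part of the bird-migration example as developed in the text: the seasonal food gradient (food_at), the single rule of moving one row toward richer food (step), the grid size, and the number of birds and days are all hypothetical.

```python
import random


def food_at(row, season, size):
    """Hypothetical food supply: richest at the row the current season favours."""
    peak = 0 if season == "summer" else size - 1
    return size - abs(row - peak)


def step(birds, season, size):
    """Each bird follows one rule: move to the adjacent row (or stay) with most food."""
    moved = []
    for row, col in birds:
        options = [r for r in (row - 1, row, row + 1) if 0 <= r < size]
        best = max(options, key=lambda r: food_at(r, season, size))
        moved.append((best, col))
    return moved


def mean_row(birds):
    """Average row occupied by the population."""
    return sum(row for row, _ in birds) / len(birds)


def simulate(size=10, n_birds=30, days=20, seed=0):
    """Return the population's mean row at the end of each season."""
    rng = random.Random(seed)
    birds = [(rng.randrange(size), rng.randrange(size)) for _ in range(n_birds)]
    history = {}
    for season in ("summer", "winter"):
        for _ in range(days):
            birds = step(birds, season, size)
        history[season] = mean_row(birds)
    return history


if __name__ == "__main__":
    print(simulate())
```

With more days per season than there are rows, every bird reaches the food-rich edge of the grid in summer and the opposite edge in winter, so a population-level 'migration' appears although no bird carries any rule about migrating as such.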

However, while avoiding the problem of artificiality can be advantageous to the 
modeller, the question of relating that model to empirical data still remains. Without 
claiming that the agents within our bird model are instantiations of digital life, we 
cannot be said to collect empirical data on those agents. Might we remain in danger 
of constructing an artificial world which we proceed to study as a separate entity 
from the real systems of interest? 


8.3 The Lure of Artificial Worlds 


8.3.1 Levinsian Modelling 


Our examination of the more pragmatic concerns of modellers through the lens 
of population biology provided some additional considerations for the simulation 
researcher. Levins (1966) developed a framework in which three dimensions of 
generality, realism and precision, important to any given model of a natural system, 
must be balanced effectively to produce a useful model. He posits that a modeller 
can focus on two of these modelling dimensions at a time, but only at the expense 
of the third; this leads him to describe three possible varieties of models, which we 
denoted L1, L2 and L3 (see Table 4.1). 

As noted in Chap. 4, however, applying this framework to certain computational 
models can be difficult. If our bird migration model uses a highly-simplified set of 
evolving agents to represent birds, and places those agents in a simplified world with 
abstracted environmental elements to affect those agents, how might we characterise 
that model under Levins’ framework? Certainly precision cannot apply, as there is 
no attempt in this formulation to match these agents with any particular real-world 
species. Nor does realism seem to apply, as the highly-abstracted nature of the model 
divorces it from the natural systems it seeks to model. Can the model be referred to 
as simply general in character, or even then is the model seeking insights which may 
end up generalising in a different fashion than in other varieties of models? 


8.3.2 Levins and Artificial Worlds 


This difficulty leads us to the question alluded to in the previous section: do we 
risk becoming mired in the study of artificial worlds for their own sake? Certainly 
our proposed simulation above would suffer from a lack of available empirical 
data to use in the model. For example, by using evolving agents to represent 
the birds we can model the progression of this abstracted species over time, but 
empirically-collected data following a species through its evolution is not available 
to provide guidance for this aspect of the model. The modeller can proclaim a 
serious advantage in being able to model something which is impossible to study 
empirically through only a relatively modest investment of programming time and 
processing time, but likewise, the lack of relevance to biology becomes ever more 
acute (once again, see Webb 2009 for a discussion of this issue in relation to Beer’s 
CTRNN model). 

In a certain way, however, this lack of empirical relevance is an attractive 
feature for the simulation researcher. Replacing the complexities of the real world 
with simplified abstractions in a wholly artificial world not only makes model 
construction potentially easier, but even avoids the practical difficulties often 
associated with traditional empirical methods. As a consequence, our bird model 
need not be tied down by Levins’ pragmatic modelling concerns, and balancing his 
three dimensions of model-building suddenly appears much easier. 

Of course, as discussed in Chap. 4, such artificial worlds create further method- 
ological difficulties of their own. While such a model may appear to avoid Levins’ 
concerns and thus produce more complete models of phenomena, confining that 
model to an artificial world creates a strong separation from the empirical data that 
can inform other varieties of models. The modeller is thus in danger of studying the 
model as a separate entity from the systems on which it is based. 

Schelling avoids this difficulty by positioning his model as a means to illustrate 
the importance of micromotives in social behaviour (Schelling 1971, 1978). While 
his model does relate to a real system, and it does take place in a highly idealised 
artificial world, within this context the model does not need a strong relationship to 
empirical data. Schelling instead strives for transparency, maximising tractability by 
creating an abstract, easily computable model. The question of relating the model 
to the real system thus becomes simplified: can the model illustrate the potential for 
individual micromotives to exert a great influence on a society? The answer, for most 
social scientists, appears to be yes, despite the model’s artificiality and simplicity. 


8.4 Modelling in Alife: Thoughts and Conclusions 


8.4.1 Lessons from Schelling 


As seen in the analysis above, there are a multitude of considerations relevant 
to the ALife modeller. From crafting a suitable theoretical backstory to avoiding 
the difficulties of artificial worlds, methodological problems are hard to avoid 
completely. Schelling provides some insight into how to approach these issues. 
The simplicity of the model allows for a coherent theoretical backstory, focusing 
only on the possible effects of micro-motives on the larger system. Meanwhile, the 
model’s transparency maintains tractability, though this brings with it a high level 
of artificiality in the model. 

As we see with Schelling, however, this artificiality is not necessarily the 
problem. The larger concern is the intent with which the model is constructed. 
A strong ALife practitioner who seeks to create digital life needs to demonstrate 
Artificial₁, and in this case he would presumably require a much higher degree of 
complexity than in Schelling’s model; even with the PSS Hypothesis for Life as a 
backstory, a grid-world of simplistic homogeneous agents could hardly be said to 
compose an information ecology. 


8.4.2 The Power of Simplicity 


The circumstance of being driven away from simplicity toward complexity in this 
search for Artificial₁ also creates a much more difficult pragmatic situation for the 
modeller. As noted in Chap. 7, simple models like Schelling’s permit the modeller to 
spend far more time theorising than tweaking. For our bird researcher, a situation in 
which a few simple lines of code provide the agent’s behaviour is preferable to one 
in which each agent contains complex neural network elements, for example. In the 
first case, the researcher can write the code and run the simulation quickly, then take 
the necessary time to examine results, run alternative versions of the simulation, and 
so forth. In the second case, the researcher could spend far more time tweaking the 
individual parameters of the agent structures: what sort of sensors might each agent 
have? How do the neural networks control movement and receive input from the 
environment? How many neurons are there, and what type are they? The questions 
become many and varied for the researcher using complex agents, and each day 
spent tweaking those agent parameters is one less day spent pondering the results of 
the simulation. 

As seen above, Schelling’s model provides one example of a means to avoid 
these pragmatic issues. His model is of such simplicity that writing a version of his 
simulation takes only a few lines of code compared to more complex simulations. 
However, clearly other types of models can maintain similar simplicity; Beer, 
for example, touts his CTRNN-based agents as displaying ‘minimally-cognitive 
behaviour’ (Beer 2003a,b). Of course, Beer’s analysis of that minimally-cognitive 
behaviour is extremely detailed and time-consuming, and may indicate that such 
analysis is impractical for agents of that type even of such relative simplicity. 
Nonetheless, Beer does demonstrate that relatively simple and analysable neural 
models are not outside the realm of possibility. 


8.4.3 The Scope of Models 


With all of these points in mind, we see that Schelling-type models are hardly 
the only permissible variety under these frameworks; in fact, a large number of 
models may display appropriate theoretical back-stories while remaining tractable. 
Schelling does, however, illuminate the central concerns tied to these modelling 
frameworks: the importance of theoretical backstory, artificiality, tractability and 
simplicity, and the scope of the model. 

The final element, scope, is an important one to note. Schelling’s model succeeds 
not only by having a cogent backstory, using its artificiality appropriately, and 
remaining tractable, but also by limiting its approach: the model aims only to 
illustrate the importance of micro-motives in a social system, not to produce 
empirical data. In the 
same way, if our bird migration researcher chose to model the movements of the 
birds from place to place simply for the purpose of realistic mimicry of their travels, 
as in Reynolds’ flocking boids (Reynolds 1987), he could do so with little theoretical 
baggage. If he then chose to declare these mimicked movements as instances of real 
flocking, or as producing relevant empirical data, suddenly far more justification is 
required. 


8.5 The Difficulties of Social Simulation 


8.5.1 Social Simulation and the Backstory 


While our earlier analysis of ALife provided some valuable insight into the limi- 
tations of agent-based modelling techniques, particularly in contrast to traditional 
mathematical models, these refined frameworks developed in those chapters do 
not translate simply to social simulation approaches. Within a field such as social 
science, the philosophical considerations we addressed in those frameworks grow 
even more troublesome. Imagine that we have created our bird migration model, and 
the in-depth construction of our programmed agents allows those agents to begin to 
communicate and even form social structures of a sort. If our stated goal is to use this 
model to investigate properties of human societies by providing a new, digital source 
of empirical data, we must not only accept the PSS Hypothesis (presuming that only 
living things may create a true society), but also related points. The researcher must 
be prepared to accept that these digital beings, alive but unrelated in a conventional 
sense to natural life, can create a real societal structure. 

In addition, this societal structure will be further removed from real, natural 
societies by its dependence on a form of ‘life-as-it-could-be.’ In that case, even if we 
accept that this virtual community of birds can create a society of their own through 
their interactions, how might we be able to relate that behaviour to the development 
of real-world societies? If we accept Sawyer’s concerns regarding the non-reductive 
individualist character of human society (Sawyer 2002, 2003, 2004), are we not 
placing ourselves even further from a possible social explanation by basing theories 
upon this digital instantiation of society? Perhaps the non-reductive characteristics 
of our digital society differ completely from those displayed in human society. Once 
again, we would be stuck studying the simulation for its own sake. 


8.5.2 Social Simulation and Theory-Dependence 


As discussed in Chap. 3, the field of ALife may be considered fundamentally theory- 
dependent: the structure and function of a given simulation is directly connected to 
the theoretical assumptions made to create that model. The framework discussed in 
that chapter chose to deal with this issue by noting the inherent theory-dependence 
of empirical science, and presenting ideas regarding the importance of theoretical 
back-stories to any empirically-motivated endeavour. 

With social simulation, however, an additional layer is added to this theory- 
dependent aspect of modelling. Not only do the agents and environment of the 
simulation present problems of theory-dependence, but so too do the additional 
aspects of social communication and behaviour that allow the model to address 
issues relevant to society. 

Further, these additional layers of complexity are not easily subdivided into a 
hierarchy or other framework which may ease the construction of a computational 
model (Klüver et al. 2003). The interdependencies between individual action and 
societal effects mean that abstract models will lose a great deal of detail in 
these missing elements, and conversely that highly-detailed models which address 
these complexities will veer toward intractability. In either case, theory-dependence 
becomes a greater issue: abstract models will require strong assumptions to remove 
these non-hierarchical complexities; and complex models will require incorporating 
ideas regarding the interaction of individuals and society. 


8.5.3 Social Simulation and Explanation 


Even if one does manage to construct a tractable social simulation with reasonable 
theoretical justification, as noted by Sawyer (2004), social explanation via social 
simulation is a difficult prospect. While the potential for agent-based models to 
demonstrate the emergence of social complexities is a possible benefit to social 
science, whether or not those models can provide a coherent explanation of social 
behaviour is unclear. 

Sawyer argues that societies, like the human mind, display qualities which 
are irreducible to simple interactions of individuals within that society; despite 
our knowledge of neuroscience, the higher-level study of mental phenomena (i.e., 
psychology) is still required to understand the human mind. Similarly, Sawyer 
argues that human society displays a non-reductive individualism, in which social 
explanation cannot be complete without addressing the irreducible effects of higher- 
level social structures. The variation of the bird migration example given in 
Sect. 5.9.2 gives one example of this phenomenon. 


8.6 Schelling’s Approach 


8.6.1 Schelling’s Methodological and Theoretical Stance 


Schelling’s modelling approach, as in our analysis of ALife, provides some 
important insights into addressing the difficulties of social simulation. Once again, 
Schelling avoids some of the difficulties facing other social simulations by virtue 
of the residential segregation model’s simplicity. As a C1 model in Cederman’s 
framework (see Table 5.1 for a summary), Schelling avoids the methodological 
difficulties of using complex agent structures (as in C2), or seeking profound 
emergent behaviours (C3). Likewise, problems of theory-dependence are minimised 
by using only simple agents with simple rules, with no attempt to address larger 
social structure. As a consequence, Schelling’s methodology remains influential 
today with researchers who seek simple models of social phenomena (Pancs and 
Vriend 2007; Benito 2007). 

Schelling further strengthens this approach through his own theoretical frame- 
work regarding the best use of models. He posits that general models of behaviour 
can produce general insights, and that within a given discipline some such insights 
may not be evident from empirically-collected data. This view seems vindicated by 
the surprising result of his segregation model, the impact of which was immediate 
and lasting within social science. 


8.6.2 Difficulties in Social Explanation 


However, while Schelling’s model does address a number of concerns relevant to 
social simulation, the problem of social explanation still presents a difficulty. As 
noted above, his model takes no interest in the presence or influence of higher levels 
of social structure; he is concerned only with the actions of individuals. Within the 
study of residential segregation, his result is likely to be only a partial explanation 
simply for that reason: housing reform, tax incentives, and other measures designed 
to influence racial integration in the housing sector are likely to have an impact 
as well. Schelling of course does not strive for such a complete explanation, as 
discussed in Chap. 7; however, those who do seek an explanation of residential 
segregation could try to use Schelling’s model as a starting point. 

Unfortunately, Sawyer’s perspective argues that avoiding these higher-level 
elements in an agent-based model and hoping for them to emerge may be fruitless 
as well (Sawyer 2004). Even if one were to add additional capabilities and com- 
plexities to Schelling’s model, in the hope of allowing for more complex emergent 
behaviours, an explanation derived from such a model would lack the contributions 
of higher-level, irreducibly-complex social institutions. As in the birdsong example 
in Sect. 5.9.2, there is an argument that such elements must be incorporated to 
produce a complete picture of the development of social behaviours. The issue of 
how to progress from Schelling’s initially successful modelling framework to deeper 
social explanation thus remains a complex one. 

At least the social scientist can take solace in the fact that such concerns are not 
alien to other modelling disciplines, as in Bedau’s discussion of ‘weak emergence’ 
as a means for avoiding the difficulty of the effect of downward causation in natural 
systems from higher-order elements (Bedau 1997). Unfortunately, if anything such 
difficulties are more acute in social simulation than in biologically orientated 
simulation, as social institutions, mass communication, and other distinctly 
societal factors add further layers of unknowns to an already difficult 
theoretical situation. 


8.6.3 Schelling and Theory-Dependence 


As noted above, the issue of theory-dependence looms large within social simu- 
lation, and even for Schelling’s simple and abstract model these problems seem to 
remain. Schelling’s model is constructed around a single, vitally important assumption: 
if we presume that individuals choose their preferred housing based on the racial 
makeup of their neighbourhood, then segregation will result. 

Schelling, and likely other theorists, would argue that such an approach is 
commendable: Schelling was using his model to test a hypothesis, which is an 
acceptable role for models. By introducing a highly tractable, transparent model to 
illustrate the potential import of these factors in residential segregation, he was able 
to present a new perspective on potential individual choices that can lead to such 
undesirable social outcomes. However, such an approach becomes difficult when 
the goal is not hypothesis-testing, but the development of new social theory. 


8.7 Luhmannian Modelling 


8.7.1 Luhmann and Theory-Dependence 


Luhmann’s influential treatises on the development of social order (Luhmann 
1995) provide a perspective which illuminates potential methods to use social 
simulation to develop social theory. His ideas regarding the low-level basis for 
human communication, and the subsequent development of social order, seem a 
natural partner for the agent-based modelling methodology. 

This approach demonstrates the fundamental limitation of Schelling’s methodol- 
ogy alluded to in the previous section. While the residential segregation model can, 
and does, provide a useful test of a hypothesis which demonstrates the importance 
of individual behaviour in human society, the overall import of that factor is 
unaddressed by such a model, given that it is confined to a singular behavioural 
assumption related to a singular social problem. We may build a Schelling-type bird 
migration model which demonstrates the importance of certain individual-based 
factors in driving migration behaviour, but the deeper question remains open: how 
do these behaviours arise in the first instance? 

For the social scientist, these are questions that must be addressed in order to 
develop a deeper understanding of the origin and development of human society. 
While we can imagine innumerable scenarios in which a Schelling-type model may 
illuminate certain singular aspects of social issues and problems, the simplicity 
and related theory-dependence of the approach limit our ability to investigate the 
fundamental beginnings of society and communicative behaviour. 


8.7.2 Luhmann and New Social Theory 


Luhmannian modelling can address this larger goal of social simulation, clearly 
a central aim of the research community (Cederman 2001; Axelrod 1997; Epstein 
1999), by removing these elements of theory-dependence. Any model constructed 
based upon the functioning of human society will incorporate a fundamental 
theoretical bias, and remove the possibility of investigating the factors that led 
to our society developing in that way initially. 

Doran’s perspective regarding the undefined nature of computational agents in 
social science (Doran 2000) ties in closely with our developed Luhmannian view. 
Doran argues that social simulations should begin below the level of agent, allowing 
for the emergence of agent structures that interact without pre-existing theoretical 
biases affecting those interactions. In both cases, the removal of theory-dependence 
from the model is paramount to its success in providing insight for the social 
scientist into the origin of society. 


8.7.3 Luhmann and Artificial Worlds 


Recalling the earlier discussion of Levins (1966, 1968) and the lure of artificial 
worlds for the modeller, this Luhmannian approach seems to fall squarely within 
this realm. After all, a simulated environment, not based upon empirical data, which 
includes abstracted virtual agents is already quite separated from the traditional 
modes of empirical study. A simulation in which most elements of existing theory 
and data are removed to study the fundamentals of society seems even further 
away from the real world; one could easily imagine such a model producing quite 
unusual agents with idiosyncratic interactions. Once again we return to the prospect 
of studying a model for its own sake, removed from the natural world. 

Where the Luhmannian approach is unique in this regard is the way in which 
this separation from the real world is vital to the model. The search for a 
fundamental social theory requires the removal of pre-existing theory-dependent 
elements in order to produce models which illuminate the importance of pre-societal 
communications and interactions. In essence, the Luhmannian approach utilises an 
artificial world to illuminate factors in the real world that we may miss, simply 
by virtue of our pre-existing biases derived from forming our models and theories 
within a society. 


8.8 Future Directions for Social Simulation 


8.8.1 Luhmannian Modelling as a Way Forward 


Clearly the Luhmannian modelling approach displays promise for those in search of 
new social theory. Without the removal of more theoretical bias from future social 
simulations, issues of theory-dependence will continue to provide difficulty for the 
social scientist who hopes to use these methodologies. Likewise, this approach 
offers a wide scope for developing new ideas regarding the origin of society, 
something not provided by Schelling-type methods. 

Perhaps more importantly, this perspective brings us closer to the larger goals 
expressed by proponents of social simulation: a means for understanding the 
detailed workings of human society. From Levins’ enthusiasm for L3 models 
(Levins 1966) to Cederman’s for C3 models (Cederman 2001), researchers in 
various fields continue to seek to explain higher-level behaviour through the interac- 
tions of component parts following simple rules. For the social scientist, Luhmann 
reduces social interaction to its simplest components, giving the community access 
to a new and potentially stimulating view of the earliest beginnings of human 
society. 


8.8.2 What Luhmann is Missing: Non-reductive Individualism 


Despite these promising elements of the Luhmannian approach outlined thus far, 
the problem of social explanation remains. Sawyer’s perspective of non-reductive 
individualism in social science (Sawyer 2002, 2003, 2004) implies that even an 
elegant portrayal of the origins of society may not be able to allow for the emergence 
of complex social structures. Given that some of these structures are irreducible to 
the actions of component individuals in the society in question, there may be great 
uncertainty as to whether a given set of Luhmannian rules for a simulation may be 
able to produce such complexity. 

Perhaps, then, our comparison with the study of the human mind is more apt 
than initially thought. As Sawyer notes, there is both a study of mind and a study 
of neurons (Sawyer 2004); likewise, such an approach is an option for social 
simulation. Analogous to the study of neurons, Luhmannian approaches can probe 
the low-level interactions of individuals in a society through the use of models. 
Then, analogous to the study of mind and mental phenomena, other models may 
probe the influence of higher-level organisation upon those low-level agents. A 
combination of these approaches could provide insight into social science that, as 
Schelling describes, may be unattainable by other means (Schelling 1978). 

For example, Luhmann’s discussion of the function of social order (Luhmann 
1995) includes an in-depth discussion of the major elements of the modern social 
institutions which pervade most human societies. This in turn inspired a model in 
which each of those institutions was modelled very simply, as a monolithic influence 
on a society of agents, with each institution affecting one element of overall agent 
behaviour (Fleischmann 2005). Thus, this model attempts to capture the ‘science 
of mind’ level of societal interaction, while also incorporating lower-level agent 
behaviours. Such a model shows great promise, as Sawyer’s objections and the 
difficulties of strong emergence hold much less weight if these downward causation 
effects can be harnessed appropriately in a model. 
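
The general idea of an institution acting as a single, monolithic downward influence on one element of agent behaviour can be sketched in a few lines. The following Python fragment is purely hypothetical and is not a reconstruction of the cited model: the `Institution` class, the trait names, and the linear "pull towards a target" update rule are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass

@dataclass
class Institution:
    name: str
    trait: str       # the single behavioural trait this institution acts on
    target: float    # value towards which the institution pulls that trait
    strength: float  # 0 = no influence, 1 = trait is set to the target at once

def apply_institutions(agents, institutions):
    """Apply each institution's downward influence to every agent, in place.

    Agents are plain dicts mapping trait names to numeric values (assumption).
    Traits not named by any institution are left untouched.
    """
    for agent in agents:
        for inst in institutions:
            current = agent[inst.trait]
            agent[inst.trait] = current + inst.strength * (inst.target - current)

# Hypothetical usage: one institution nudging one trait of a two-agent society.
society = [{"cooperation": 0.0, "mobility": 1.0},
           {"cooperation": 1.0, "mobility": 0.2}]
law = Institution(name="law", trait="cooperation", target=1.0, strength=0.5)
apply_institutions(society, [law])
```

In a fuller model this update would be interleaved with the agents' own low-level interaction rules, so that both levels of influence operate in each time step; how best to do so is a design decision that this sketch leaves open.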


8.8.3 Integrating Lessons from ALife Modelling 


As our analysis of Schelling shows, the frameworks in the earlier chapters that 
underwrite certain varieties of ALife models can be brought to bear on social 
simulations as well. The issue of theory-dependence has proved vital enough to the 
success of social simulation that the latter chapters of this text aimed to develop 
a modelling framework which removes that difficulty. Similarly, the pragmatic 
concerns of generality, realism, precision and tractability will remain important 
even in a Luhmannian approach which aims to develop social theory; our modified 
Levinsian framework provides a useful guide to the limiting factors present in all 
models of natural phenomena. 

A larger question in both ALife and social simulation is illustrated neatly by 
Schelling’s model and related theoretical justifications: how does the intent and 
scope of a model affect its usability, either empirically or theoretically? Schelling 
shows how a simple model, intended to demonstrate the importance of a singular 
factor in a singular problem, can have wide-ranging effects on related theory. 
Despite using agents that followed only a single rule, Schelling’s demonstration 
of the potential impact of individual micro-motives on the segregation problem led 
to a great deal of research investigating the impact of such individual factors on all 
varieties of social problems. 

Likewise, our analysis of ALife demonstrated the importance of theoretical 
backstory in driving the acceptance of a model as a contribution to empirical data 
within a discipline. Empirical science is full of such back-stories, but they are 
implicit: the trans-cranial magnetic stimulation researcher believes tacitly that such 
methods are analogous to producing real lesions in a patient’s brain. In contrast, 
the agent-based modeller deals with artificially-generated data that is not simply 
produced in a natural source through a different means, but is produced entirely 
artificially, in a constructed world. For this reason, the theoretical backstory for 
agent-based models must be explicit, as otherwise a connection between the model 
and related data from the natural world becomes difficult to establish. 


8.8.4 Using the Power of Simplicity 


As discussed throughout all three parts of this text, the issue of complexity in 
models has numerous ramifications. For the researcher, highly complex models are 
difficult to develop successfully; there are a number of choices to be made at the 
implementation level. For example, an evolutionary model requires a number of 
parameters to govern the evolution and reproduction of its agents. How should the 
agents replicate? Should crossover and sexual reproduction be used? What sort of 
mutation rates might be necessary, and will test runs of the simulation illuminate 
the best rate of mutation with which to produce interesting results? Additional 
complexities such as neural network elements or detailed agent physiologies raise 
even more questions at the implementation level. 
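
To give a sense of how quickly such implementation choices accumulate, the following Python sketch gathers the parameters of a deliberately tiny evolutionary loop in one place. Everything in it is an assumption made for illustration: the bit-string genomes, the toy fitness function, tournament selection, single-point crossover, and the default values are not drawn from any model discussed in this text.

```python
import random
from dataclasses import dataclass

@dataclass
class EvoParams:
    population_size: int = 30
    genome_length: int = 16      # must be at least 2 for single-point crossover
    mutation_rate: float = 0.02  # per-bit probability of flipping
    use_crossover: bool = True   # sexual (two-parent) vs. asexual reproduction
    tournament_size: int = 3     # selection pressure

def fitness(genome):
    """Toy fitness: number of 1-bits (a stand-in for any task-specific measure)."""
    return sum(genome)

def select(population, params, rng):
    """Tournament selection: the fittest of a few randomly drawn individuals."""
    contenders = rng.sample(population, params.tournament_size)
    return max(contenders, key=fitness)

def make_child(population, params, rng):
    """Produce one offspring according to the reproduction settings."""
    parent_a = select(population, params, rng)
    if params.use_crossover:
        parent_b = select(population, params, rng)
        cut = rng.randrange(1, params.genome_length)  # single-point crossover
        child = parent_a[:cut] + parent_b[cut:]
    else:
        child = list(parent_a)                        # asexual copy
    # Flip each bit independently with probability mutation_rate.
    return [1 - g if rng.random() < params.mutation_rate else g for g in child]

def next_generation(population, params, rng):
    """Replace the whole population with newly produced offspring."""
    return [make_child(population, params, rng)
            for _ in range(params.population_size)]
```

Even this toy version exposes five tunable settings before any neural controller or physiology has been added, and each of them can change the outcome of a run.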

Similarly, as discussed in relation to the Levins framework, greater complexity 
leads to greater difficulties in tractability (in the context used here, a greater 
difficulty in producing substantive analysis, as well as computability). Levins 
discussed the possibility of producing models of such complexity that they exceed 
the cognitive limitations of the scientist studying them (Levins 1968). For the 
mathematical modeller, a model which captures every possible factor in a migrating 
bird population could end up consisting of hundreds of linked partial differential 
equations, a mathematical morass of such complexity that analysis becomes fruit- 
less. For the computational modeller, a model of similar character could produce 
highly-detailed agents with individual physiologies and complex neural structures 
that allow for remarkably rich behaviour; yet, such a scenario is a nightmare for 
the analyst, for whom divining the function and impact of these complex internal 
structures takes incredible amounts of time even for the simplest neural networks 
(see Chap. 4 for discussion in relation to Beer’s model). 

Thus, we must take inspiration from Schelling in this respect. Greater ease 
in implementation and analysis are two enormous advantages of simpler models. 
Particularly in the case of social theory, where the potential complexities when 
studying human society in simulated form are vast, simple and elegant models 
which illuminate crucial aspects of social systems are the most likely to produce 
substantive insights. 


8.9 Conclusions and Future Directions 


8.9.1 Forming a Framework 


This text has sought to investigate in-depth both the theoretical and methodological 
concerns facing researchers who utilise agent-based modelling techniques. By 
examining these issues, and developing frameworks to understand the limitations of 
such models, future directions for substantive modelling research can be identified. 

Our early analysis of ALife in the beginning of the first section of this text 
demonstrated the importance of theoretical justification in the model-building 
process, as well as the potential theoretical pitfalls of artificiality in simulations. 
Whether it is Newell and Simon’s PSS Hypothesis, or Silverman and Bullock’s 
PSS Hypothesis for Life, computational models require a theoretical basis before 
they can fit into the conventional tapestry of empirical methods. In the case of 
both of these frameworks, the implied empirical acceptability of models built on 
such premises comes at the price of philosophical baggage for the modeller to bear. 
Those who choose not to take such strong philosophical positions, as in weak ALife, 
find themselves in the situation of facing more difficult theoretical problems despite 
avoiding philosophical conundrums. 

This baggage became increasingly evident during our examination of Levins’ 
modelling framework (Levins 1966). The modeller who seeks to produce useful 
insights about natural systems must strike a difficult balance between both Levins’ 
proposed three modelling dimensions and the additional aspect of tractability. This 
fundamentally limits the ability of a researcher to develop models that capture the 
complexities of natural systems; instead, they are left waiting for techniques that 
may enhance tractability while still allowing a rough balance of the three Levinsian 
dimensions. 

Initially, our analysis of social simulation in the second section of this text 
seemed even more problematic than for the ALife community. Social simulations 
suffer the same methodological and theoretical complexities of ALife simulations, 
but with the added problem of increased layers of complexity through the addition 
of social considerations. Even assuming that such problems could be surmounted, 
the difficulty of providing complete social explanation remained. 

However, the introduction of Luhmannian systems sociology introduced another 
means of utilising simulation in social science. In the case of developing new social 
theory, creating models that are closer to reality in fact poses a fundamental prob- 
lem: the issue of theory-dependence prevents the social scientist from producing 
simulations that avoid biases drawn from our own societal experience. Artificiality 
thus becomes a desirable trait, bringing the social theorist away from pre-existing 
theoretical biases in his models. The lure of artificial worlds in this case is thus 
practical: rather than attempting to avoid the inherent difficulties of modelling 
illuminated by Levins, and instead losing explanatory capacity, the Luhmannian 
modeller avoids theory-dependence and gains the ability to generate new elements 
of social theory. 

In essence, the root of the issue comes back to artificiality and intent. A modeller 
who wishes to create an Artificial₁ instance of life or mind can simply look to 
the PSS Hypotheses. One who wishes to mimic a natural behaviour as accurately 
as possible, without claiming any theoretical revelations as a result, can simply 
declare his work Artificial₂. A weak ALife researcher, or one taking a similar 
perspective in another discipline, can apply his Artificial₂ simulation to the examination 
of a natural system or an element of human society by balancing his model’s 
dimensions within the modified Levins framework, and ensuring both tractability 
and a reasonable theoretical backstory (as in Schelling and his focus on general 
insight and transparency in modelling). 

For those who wish to apply their models to the development of new social 
theory, the issue of artificiality becomes more nuanced. The modeller in this case 
does not seek an Artificial₁ instantiation of a natural phenomenon, nor does he seek 
an Artificial₂ imitation of something natural. Instead, the modeller seeks to develop 
a simulation in which pre-existing elements of the natural system are removed, along 
with pre-existing theoretical positions on that system, in order to develop theories 
independent of bias. 


8.9.2 Messages for the Modeller 


Now that we have examined a number of modelling frameworks in detail, analysed 
and compared each in Chap. 6, and applied them to Schelling in Chap. 7, we have 
been able to ascertain the most fruitful directions for future models in the social 
sciences to take. However, on a more general level, the models proposed here will 
not be the entirety of social simulation. Indeed, a great variety of different types of 
models will continue to develop in this field, and while Luhmannian modelling may 
hold great promise for those interested in formulating new social theory and in using 
simulation by playing to its greatest strengths, this is not to say that other types of 
models in social science must take a lesser role. 

For the social science modeller interested in producing different varieties of 
models, this argument has brought forth a number of philosophical and method- 
ological frameworks which can provide insight into making those models more 
relevant to empirical social science. Our discussion of modelling frameworks in 
ALife illuminated the importance of creating a theoretical backstory; since modelling 
is inherently a theory-dependent approach, an understanding and elucidation of the 
theoretical underpinnings of a given model is vital for creating a basis upon which 
to understand that model. Merely stating that the model is somehow reminiscent of 
a real-world incidence of some social phenomenon is not enough to create a useful 
justification. 

Our subsequent discussion of modelling in social science saw these frameworks 
applied to a new area of enquiry, and described the various theoretical problems 
at issue in the field which have a significant impact for the model-builder. While 
the idea of a balancing act inherent in creating a model between complexity and 
analysability is nothing new, bringing the Levins framework into the discussion 
in social science helps to illuminate more specific pragmatic modelling concerns 
that are important in computational models. The in-depth discussion of problems 
of explanation in social science demonstrates how the complexities of modelling 
human society create additional difficulties for the modeller, adding to the problems 
of strong emergence in the field of ALife. The modeller interested in social science 
must take care to understand the limitations of the computational approach, and to 
avoid incorporating too many elements which may lead to unanalysable complexity. 
Models offer some new research directions to the social scientist, but also new 
methodological difficulties; models cannot solve all of these problems and must 
be deployed appropriately in conjunction with empirical data and useful theoretical 
background. 

Finally, we applied the frameworks discussed in relation to both ALife and social 
science simulation to one particular example, that being Schelling’s residential 
segregation model. Schelling provides the best means for demonstrating the tensions 
between model implementation and model design within social science. The 
model’s inherent simplicity limits the conclusions which one can draw from its 
results; yet, that same simplicity allows for greater communicability and impact 
among the social science community. Replicating and extending Schelling’s model 
is simple in comparison to many computational models, and as a consequence a 
greater number of the social science community could join in the conversation 
spurred by the creation and dissemination of the model. In this respect, Schelling’s 
limitation of his model’s scope to a simple demonstration of the possible impact of 
a single individual factor on a complex social problem was a big part of its success; 
the modeller would do well to remember this point as a reinforcement that models 
need not encompass every element of a given problem to create intriguing insight 
and new directions for empiricists and theorists. 


8.9.3 The Tension Between Theory-Dependence and Theoretical Backstory 


A central element running throughout the three major sections of this text is the 
discussion of theory-dependence in models. Early on in the ALife discussion this 
element became important, and the possible theoretical biases on display in a given 
model became a serious potential difficulty in applying that model to the field in 
question. The proposed solution to this issue has been the creation of a salient 
theoretical backstory, one which describes the reasons which make a given model 
acceptable as contributing in some important way to the discourse of the field 
to which it is applied. By doing so, the researcher can justify their model and 
its conclusions as important and not simply relevant only to those interested in 
mathematical curiosities; as discussed in Chap. 3, the Alife researcher might do so 
by contending that their model is an instance of an information ecology and thus an 
instance of empirical data-collection despite its inherently artificial nature. 
However, such theoretical backstory can also limit the applicability of a model. 
Too strong a theoretical framework, or one which does not hold under closer 
scrutiny, can result in a model which suffers from greater theory-dependence, 
rather than less. The tension here then is the delicate balance between providing a 
theoretical context for a model, and having that model stuck too deeply in theoretical 
elements that limit its usefulness to the broader field. There is no easy solution to 
this tension, as these pages have demonstrated. However, the modelling frameworks 
discussed herein provide guidelines to avoid such pitfalls. A careful examination 
of the theoretical and pragmatic circumstances surrounding a model can help the 
modeller to avoid both the theoretical pitfalls that may make a model’s applicability 
suspect (as in Webb’s discussion of Beer; Webb 2009) and those elements of 
model construction and implementation that may lead to great difficulties in analysis 
and the presentation of useful results. 


8.9.4 Putting Agent-Based Modelling for the Social 
Sciences into Practice 


Having investigated the use of agent-based models in Alife and social science, we 
are more aware of the issues facing modellers in these areas and have developed 
frameworks to help us navigate these complexities. As modellers seeking the most 
effective application of agent-based models to the social sciences, our work is not 
yet done, however. General principles for effective modelling are of great value, 
but within each discipline and sub-discipline of the social sciences there are further 
complexities to face before we can put these ideas into practice. 

Creating individualised modelling frameworks for all the major disciplines of 
the social sciences is obviously beyond the scope of this text. Situating our 
modelling ideas properly in an individual discipline requires in-depth knowledge of 
its history, scope and research aims. We would need to analyse the methods of each, 
determining how agent-based models can enhance researchers’ efforts in that field. 
Finally, in order to be truly convincing to those looking to branch out into agent- 
based approaches, we would need to present some worked examples that illustrate 
the potential of these methods to contribute useful knowledge. 

In Part III, we will build upon the foundations presented thus far and offer 
an example of the development of a model-based approach to a social science 
discipline — in this case demography, the study of populations. Demography is a 
field with a long history, and close ties to empirical population data. We will present 
a framework for model-based demography, a considered approach to demographic 
simulation that takes into account the needs of the discipline and its practitioners. 

In order to develop a model-based demography, we will start by analysing the 
methodological development of the field, from its earliest beginnings to the bleeding 
edge. Demography, given its nature and close ties to real-world data and policy, 
tends toward the social simulation approach to social science modelling. As a 
consequence, our analysis will focus particularly on the demographic approach to 
data and how simulation can maintain the field’s focus on real-world populations. 
We will show that simulation can not only live up to demographers’ expectations 
in this respect, but can even enhance their ability to gain new insight into 
the processes and functions underlying demographic change by helping us to 
address demography’s core epistemological challenges of uncertainty, aggregation 
and complexity. 
We will then examine some worked examples of agent-based modelling, with a 
particular focus on two modelling projects which will be presented in detail. These 
two models illustrate how agent-based models can work in tandem with traditional 
statistical demography to build simulations that closely mirror the behaviour of the 
real-world populations under study. Finally, we will discuss the status of model- 
based demography and its potential to contribute to the discipline, and theorise about 
the implications of this effort at methodological advancement for other areas of 
social science that are experimenting with a model-based approach. 


References 


Axelrod, R. (1997). The complexity of cooperation. Princeton: Princeton University Press. 

Bedau, M. A. (1997). Weak emergence. In J. Tomberlin (Ed.), Philosophical perspectives: Mind, 
causation, and world (Vol. 11, pp. 375-399). Oxford: Blackwell. 

Beer, R. (2003a). Arches and stones in cognitive architecture. Adaptive Behavior, 11(4), 299-305. 

Beer, R. (2003b). The dynamics of active categorical perception in an evolved model agent. 
Adaptive Behavior, 11(4), 209-243. 

Benito, J. M. (2007). Modelling segregation through cellular automata: A theoretical answer. 
Working Papers Serie AD, Instituto Valencia de Investigaciones Economicas (pp. 2007-16). 

Cederman, L. E. (2001). Agent-based modeling in political science. The Political Methodologist, 
10, 16-22. 

Doran, J. E. (2000). Trajectories to complexity in artificial societies. In A. Kohler & G. Gumerman 
(Eds.), Dynamics in human and primate societies. New York: Oxford University Press. 

Epstein, J. (1999). Agent-based computational models and generative social science. Complexity, 
4(5), 41-60. 

Fleischmann, A. (2005). A model for a simple Luhmann economy. Journal of Artificial Societies 
and Social Simulation, 8(2), 4. 

Klüver, J., Stoica, C., & Schmidt, J. (2003). Formal models, social theory and computer 
simulations: Some methodical reflections. Journal of Artificial Societies and Social Simulation, 
6(2), 8. 

Levins, R. (1966). The strategy of model-building in population biology. American Scientist, 54, 
421-431. 

Levins, R. (1968). Evolution in changing environments. Princeton: Princeton University Press. 

Luhmann, N. (1995). Social systems. Stanford: Stanford University Press. 

Pancs, R., & Vriend, N. J. (2007). Schelling’s spatial proximity model of segregation revisited. 
Journal of Public Economics, 91, 1-24. 

Reynolds, C. W. (1987). Flocks, herds and schools: A distributed behavioural model. Computer 
Graphics, 21(4), 25-34. 

Sawyer, R. K. (2002). Nonreductive individualism: Part I: Supervenience and wild disjunction. 
Philosophy of the Social Sciences, 32, 537-561. 

Sawyer, R. K. (2003). Artificial societies: Multi-agent systems and the micro-macro link in 
sociological theory. Sociological Methods and Research, 31(3), 37-75. 


Sawyer, R. K. (2004). Social explanation and computational simulation. Philosophical Explo- 
rations, 7(3), 219-231. 

Schelling, T. C. (1971). Dynamic models of segregation. Journal of Mathematical Sociology, 1, 
143-186. 

Schelling, T. C. (1978). Micromotives and macrobehavior. New York: W. W. Norton. 

Webb, B. (2009). Animals versus animats: Or why not the real iguana? Adaptive Behavior, 17(4), 
269-286. https://doi.org/10.1177/1059712309339867. 


Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 
International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, 
adaptation, distribution and reproduction in any medium or format, as long as you give appropriate 
credit to the original author(s) and the source, provide a link to the Creative Commons license and 
indicate if changes were made. 

The images or other third party material in this chapter are included in the chapter’s Creative 
Commons license, unless indicated otherwise in a credit line to the material. If material is not 
included in the chapter’s Creative Commons license and your intended use is not permitted by 
statutory regulation or exceeds the permitted use, you will need to obtain permission directly from 
the copyright holder. 


Part III 
Case Study: Simulation in Demography 


Having now established a theoretical basis for the use of agent-based modelling 
methodologies within the social sciences, the question remains: how might we apply 
this theoretical framework in practice within our chosen disciplines? In Part III we 
will examine this question in detail, through the lens of demography. 

Demography is a powerful and relatively ancient discipline, tracing its origins 
back to the seventeenth century. Demographic studies are frequently influential 
policy tools, driving large-scale changes in social and governmental policy at 
local, national and even international levels. Population data is, by nature, rich in 
detail and highly useful for policy-makers across the political spectrum. 

Demographers have in recent years turned toward simulation, seeking a new 
approach which marries the statistical power of traditional demographic methods 
with the ability to understand and manage complexity. In particular, the drive 
to understand the elusive micro-macro link — the ways in which individual-level 
behaviours lead to population-level changes — has motivated demographers to seek 
out agent-based modelling methods, which can incorporate both individual-level 
behaviours and population-level influences and policies. 

In the chapters ahead we will discuss the history and progress of demography as 
a discipline, the potential of ABMs to contribute to these vital research goals, and 
the ways in which simulation can be applied in this context. We will outline a new 
research programme for demography which incorporates simulation modelling as 
a core component, a programme which will help steer demography toward a more 
expansive future — and which could serve as a model for other disciplines seeking 
to incorporate simulation into their methodological toolbox. 


Chapter 9 
Modelling in Demography: From Statistics 
to Simulations 


Jakub Bijak, Daniel Courgeau, Robert Franck, and Eric Silverman 


9.1 An Introduction to Demography 


Demography, the study of population change, is a discipline with a lengthy and 
storied history. Most consider demography to have begun approximately 350 years 
ago, through the work of Graunt (1662), and the field continues to evolve today 
and incorporate new methods (Courgeau et al. 2017). As discussed in Parts I and II, 
the advent of agent-based modelling has brought with it the possibility of applying 
simulation methodologies to the social sciences, and in this respect demography is 
no exception. 

Recent work in demography has identified agent-based modelling as a method 
with particular relevance for demographers; Billari and Prskawetz suggest that the 
incorporation of these methods could result in the development of a new subfield of 
agent-based computational demography, or ABCD (Billari and Prskawetz 2003). 
ABMs have also been cited as a way to increase the theoretical relevance of 
demography, by inspiring demographers to delve deeper into social theory in their 
quest to design and parameterise more sophisticated models (Silverman et al. 2011; 
Burch 2002, 2003). 

In this chapter we will go further and propose that ABMs, beyond just heralding 
the birth of a new subfield, have the potential to push the demographic 
research agenda in a new direction.¹ We will summarise the historical development 
of demography from Graunt’s time until the present day, and demonstrate that 
demography has displayed a penchant for incorporating new methodologies and 
building upon the work of years past. We will propose that ABMs and related 
simulation methods can form the foundations for a new model-based demography, a 
step forward for the progression of demography in the face of an ever more complex 
and changeable world. 


¹ This chapter is based upon ideas from Courgeau et al. (2017); please see the original paper for 
additional detail on some finer points related to demographic history and knowledge. 

In Sect. 9.2 below, we will discuss the historical development of demographic 
methods. We will follow this in Sect. 9.3 with a description of the challenges faced 
by demography due to uncertainty and complexity in human populations. Sec- 
tion 9.4 will present the means by which demographers can incorporate simulation 
methods into the scientific programme, and Sect. 9.5 will bring this all together into 
a proposal for a model-based demography. 


9.2 The Historical Development of Demography 


Demography, as mentioned above, is commonly thought to have begun with the 
work of Graunt in the seventeenth century (Graunt 1662). While the discipline 
itself may be ancient, over time the methods used in demography have evolved 
continuously, and demographers have incorporated a wide variety of approaches 
into their methodological toolbox. 

Note that throughout this chapter, we will refer to these methodological changes 
as ‘paradigms’, but we are using the term somewhat differently from the modern- 
day Kuhnian interpretation (Kuhn 1962). Here we use ‘paradigm’ to refer to the 
relationship between observed phenomena within a population and the key factors 
of mortality, fertility and migration which are used to explain population change in 
demography. We will identify four main paradigms over the course of demography’s 
history, and outline the primary differences between each and how these approaches 
have built on one another cumulatively over the centuries. 

Following Courgeau et al. (2017), we will recall Bacon’s seventeenth-century 
elaboration of the inductive method as we begin to discuss the origins of 
demographic thought (Bacon 1863, aphorism 19): 


There are and can be only two ways of searching into and discovering truth. The one flies 
from the senses and particulars to the most general axioms, and from these principles, 
the truth of which it takes for settled and immovable, proceeds to judgement and to the 
discovery of middle axioms. And this way is now in fashion. The other derives from the 
senses and particulars, rising by a gradual and unbroken ascent, so that it arrives at the most 
general axioms at last of all. This is the true way, but as yet untried. 


Here Bacon sets up a contrast between the dominant methods in his time, and 
a second approach which became a foundational principle for modern scientific 
thought. In the former case, axioms are derived from human notions and intuitions, 
rather than observation of nature itself. Courgeau et al. argue that the ‘idols’ which 
derive from such an approach are still a problem in certain areas of science, even in 
the modern era (Courgeau et al. 2014). 
Bacon’s alternative proposal requires detailed observation of the object of study 
and the development of axioms by induction. Through observation and experimen- 
tation, we can discover the principles governing the natural or social properties we 
seek to study. As argued by Franck (2002b), induction in the Baconian method 
requires that these principles are such that if they were not present, ultimately the 
properties we observe would take an entirely different form. 

These foundational ideas were essential to the development of the first era 
of demography. Before Graunt, human events like births and deaths were not 
something to be studied or predicted, but were instead within the strict province 
of God’s plans for humanity. Graunt instead brought us the concept of the statistical 
individual — an important concept which we will revisit later in Part III — which 
would experience abstract events of fertility and mortality (Graunt 1662; Courgeau 
2007). This key conceptual revolution allowed for the development of a science of 
human population, and thus led to the advent of demography, epidemiology, and other 
related fields. 

Graunt also demonstrated the importance of probability in population studies, 
based upon the work of Huyghens (1657). He was able to use an estimation of 
the probability of death to estimate the population of the city of London, and in 
so doing was the first to use the concept of a statistical individual to examine 
population change scientifically. This inspired the work of contemporaries, such as 
Sir William Petty and his Political Arithmetick (Petty 1690), and was significant for 
the development of political economics in general. In the ensuing decades, and the 
following century, European thinkers continued to develop this school of thought. 


9.2.1 Early Statistical Tools in Demography 


The concept of epistemic probability brought forth by Bayes (1763) and Laplace 
(1774, 1812) had a significant impact on demography. We will not go into a great 
deal of detail on the specifics here; please turn to Courgeau (2012) and 
Courgeau et al. (2017) for a more in-depth summary. Speaking broadly, the advent 
of these techniques allowed demographers to answer more salient questions about 
human events through the use of prior probabilities, which had notable benefits for 
the growing field of population science. Similarly, the least-squares method drawn 
from astronomy began to be applied to demography as well, and over the course 
of the nineteenth century this method became quite widely used (Courgeau 2012). 
The growing use of censuses also meant that extensive data became much 
more accessible. 

However, these early statistical tools assumed that the variables under inves- 
tigation displayed a certain mathematical structure, and these structures are not 
necessarily evident in the real world. This can lead to the ecological fallacy, 
in which aggregate, population-level data are wrongly used to draw conclusions about 
individual-level behaviour. The data being collected during this period was also 
entirely period-based or cross-sectional (Courgeau 2007). This cross-sectional 
paradigm implied that social factors influencing individuals are a result of aspects 
of the society surrounding them (i.e., political, economic, or social characteristics). 
As we will soon see, this separation of individuals and their social realities does not 
hold up under further scrutiny in many cases. 
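
The ecological fallacy mentioned above can be made concrete with a small numerical illustration. The sketch below uses entirely hypothetical data for three invented regions (`north`, `centre`, `south`), constructed so that the relationship between two variables is negative among individuals within every region, while the regional averages suggest a positive relationship. The `pearson` helper is a plain implementation of the usual correlation coefficient.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical individual-level records (x, y) grouped by invented region.
# Within each region y falls as x rises, but regions with a higher
# average x also have a higher average y.
regions = {
    "north": [(1, 5), (2, 4), (3, 3)],
    "centre": [(4, 8), (5, 7), (6, 6)],
    "south": [(7, 11), (8, 10), (9, 9)],
}

# Individual-level correlation inside each region.
within = {name: pearson(*zip(*rows)) for name, rows in regions.items()}

# Correlation computed from regional averages only (the aggregate view).
means_x = [sum(x for x, _ in rows) / len(rows) for rows in regions.values()]
means_y = [sum(y for _, y in rows) / len(rows) for rows in regions.values()]
aggregate = pearson(means_x, means_y)

print(within)
print(aggregate)
```

An analyst with access only to the regional averages would infer a relationship whose sign is the opposite of the one holding among the individuals in each region.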


9.2.2 Cohort Analysis 


The development of the cohort analysis paradigm happened in the United States 
following World War II. Ryder (1951) was one of the early proponents, but Henry 
(1959) further developed and formalised the cohort approach. Courgeau summarises 
the core concept of the cohort analysis paradigm (Courgeau 2007, p. 36): 


The demographer can study the occurrence of only a single event, during the life of a 
generation or cohort, in a population that preserves all its characteristics and the same 
characteristics for as long as the phenomenon manifests itself. 


In order for these analyses to work well, however, we must assume that the 
population is homogeneous and that any interfering phenomena are independent. 
Such restrictions meant that demographers quickly sought out methods to study 
heterogeneous cohorts and interdependent phenomena. Thus Aalen developed a 
demographic application of a general theory of stochastic processes (Aalen 1975). 


9.2.3 Event-History Analysis 


The resultant event-history paradigm built upon these foundations allowed us to 
study the complex life histories of individuals (Courgeau and Lelièvre 1992). We 
are able to identify how both demographic and non-demographic factors affect 
individual behaviour. Of course, these analyses require extensive data; we need to 
follow individuals through their lives and collect information on their individual 
characteristics and the events which befall them. This means that longitudinal 
surveys become highly important in this type of demographic research. 

The event-history paradigm enforces a collective point of view, in which we 
estimate the parameters of a random process that affects all individuals and their 
trajectories through life, via analysis of a sample of individuals and their characteristics. 
This is perhaps conceptually difficult, but in essence we are seeking understanding 
of a process underlying all of these individual trajectories, rather than insight into 
the individuals themselves. Again we are studying statistical individuals, not real, 
observed individuals. 
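
As a deliberately simplified illustration of estimating the parameters of an underlying process from a sample of individuals, the sketch below assumes the simplest possible case: a constant hazard shared by everyone, with some individuals censored (still event-free when observation ends). Under that assumption the maximum-likelihood estimate of the hazard is the number of observed events divided by the total time at risk. The function name `constant_hazard_mle` and the sample records are hypothetical; real event-history analyses use far richer models with covariates and time-varying hazards.

```python
def constant_hazard_mle(records):
    """Estimate a constant hazard rate.

    `records` is a list of (years_observed, event_occurred) pairs, one per
    individual; event_occurred is False for censored observations.
    """
    events = sum(1 for _, occurred in records if occurred)
    exposure = sum(years for years, _ in records)  # total person-years at risk
    return events / exposure


# Hypothetical sample: two events and two censored individuals.
sample = [(2.0, True), (5.0, False), (3.0, True), (10.0, False)]
rate = constant_hazard_mle(sample)  # 2 events over 20 person-years
print(rate)
```

The estimate describes the process assumed to generate all of the trajectories, not any one of the four individuals, which is the sense in which the analysis concerns statistical rather than observed individuals.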

However, in contrast to the ecological fallacy of the cross-sectional paradigm, 
here we may fall afoul of the atomistic fallacy, in which our focus on individual 
characteristics leads us to ignore the broader, societal context in which individual 
behaviours develop. As described in Part II, individual behaviours are inextricably 
tied to the complex, multi-layered society in which they live, so isolating these 
processes can lead to misleading results and incorrect conclusions. 


9.2.4 Multilevel Approaches 


In order to surpass these limitations, we must introduce the concept of social 
groupings, which can include families, social networks, workplaces, political affili- 
ations, and many more besides. Alongside these concepts demographers developed 
multilevel analyses to better link the individual with the social context around them 
(Mason et al. 1983; Goldstein 1987). This multilevel paradigm can resolve the gap 
between population-level models and individual event-histories: 


The new paradigm will therefore continue to regard a person’s behaviour as dependent on 
his or her past history, viewed in its full complexity, but this behaviour can also depend on 
external constraints on the individual, whether he or she is aware of them or not. (Courgeau 
2007, pp. 79-80) 


Both the ecological and atomistic fallacies fade away under this new approach, 
given the direct links that are now possible between individuals and their social 
context: 


The ecological fallacy is eliminated, since aggregate characteristics are no longer regarded 
as substitutes for individual characteristics, but as characteristics of the sub-population in 
which individuals live and as external factors that will affect their behaviour. At the same 
time, we eliminate the atomistic fallacy provided that we incorporate correctly into the 
model the context in which individuals live. (Courgeau 2007, pp. 79-80) 
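
For readers unfamiliar with the form such models take, a common textbook example is the two-level random-intercept model sketched below. The notation is illustrative and is not taken from the works cited above: $i$ indexes individuals, $j$ indexes the groups in which they live, $x_{ij}$ is an individual characteristic, $z_j$ is a characteristic of the group, and $u_j$ captures remaining unobserved differences between groups.

```latex
% Two-level random-intercept model (illustrative notation)
y_{ij} = \beta_0 + \beta_1 x_{ij} + \beta_2 z_j + u_j + e_{ij},
\qquad u_j \sim N(0, \sigma_u^2), \quad e_{ij} \sim N(0, \sigma_e^2)
```

Here the aggregate characteristic $z_j$ enters as an external factor acting on the individual rather than as a substitute for $x_{ij}$, which is the distinction drawn in the passage quoted above.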


9.2.5 Cumulativity 


As demonstrated in the brief historical summary above, demography has advanced 
over the centuries through a steady succession of 
paradigms. Each new paradigm has taken previous approaches as a starting point, 
identified their shortcomings and offered a means to overcome them. Having said 
that, the new paradigms have not eliminated the old; period, cross-sectional and 
cohort analyses remain relevant today, and are still used when the research question 
being posed would be suitably answered by one of those approaches. This is 
reminiscent of the situation in physics, where Newtonian physics is still perfectly 
relevant and useful in situations where relativistic effects have little or no impact. 

In demography, following Courgeau et al. (2017), we describe this as a cumu- 
lativity of knowledge, in which new paradigms have added new perspectives while 
retaining the insights provided by previous ones: 
Cumulativeness of knowledge seems self-evident throughout the history of population 
sciences: the shift from regularity of rates to their variation; the shift from independent 
phenomena and homogeneous populations to interdependent phenomena and heterogeneous 
populations; the shift from dependence on society to dependence on the individual, 
ending in a fully multilevel approach. Each new stage incorporates some elements of the 
previous one and rejects others. The discipline has thus effectively advanced thanks to the 
introduction of successive paradigms. (Courgeau 2012, p. 239) 


Table 9.1 Four successive paradigms in demography 

Paradigm       Period   Key focus 
Period         1662-    Macro-level phenomena 
Cohort         1950s-   Macro-level phenomena, measured along the cohort dimension 
Event history  1980s-   Micro-level phenomena 
Multilevel     1980s-   Macro-, micro-, and meso-level phenomena, measured from multiple perspectives 


Table 9.1 provides a summary of the four paradigms described above. Each of the 
four seeks to examine the key processes in population sciences — mortality, fertility 
and migration. However, each paradigm characterises the relationships between 
these processes differently. 

These paradigms are not static, either; each is undergoing continuous refinement 
over time. In addition, these four current paradigms still struggle to resolve certain 
types of research questions. For example, examining interactions between different 
elements of complex populations remains beyond the scope of even the cutting-edge 
event-history approach. This difficult problem, often referred to as the micro-macro 
link in demographic circles, is a familiar one from our perspective, forming a 
significant motivation for deploying agent-based models and social simulations in 
general. Conte et al. offer a succinct description of the micro-macro link²: 


...the loop process by which behaviour at the individual level generates higher-level 
structures (bottom-up process), which feedback to the lower level (top-down), sometimes 
reinforcing the producing behaviour either directly or indirectly. (Conte et al. 2012, p. 336) 


In addition, the micro-macro link is not necessarily uni-directional; higher-level 
actions, for example political decisions, can affect individual behaviours, 
which might then necessitate additional policy measures, and so forth. Multilevel 
approaches cannot cope with this kind of bidirectional effect. As we will see, a 
model-based demography may be better-placed to help demography cope with these 
complex aspects of population change. 


² We would go further, however, and note that sometimes the higher-level behaviours can go in 
the opposite direction from those at the lower level, producing what Boudon (1977) refers to as 
“perverse effects”. 
9.3 Uncertainty, Complexity and Interactions 
in Population Systems 


As described in Part II, studying the complexities of human social interaction 
introduces a host of challenges for the modeller. These challenges are worth re- 
examining in the specific context of demography, as here they take on a somewhat 
different character than in other social sciences. Here we will identify the three 
primary epistemological challenges facing demography today, which will further 
inform our development of a model-based research agenda. 

Demography, while incorporating aspects of both individual- and population- 
level behaviour and their attendant complexities, benefits from having frequent 
access to very rich datasets, due to the inherent usefulness of those datasets 
for governments throughout the world. Demographic data also displays strong 
and persistent relationships, and much critical information on future population 
dynamics is already embedded in a population’s age structures. The long-term 
empirical focus of demography has allowed for these relationships to be examined 
in significant detail (Xie 2000; Morgan and Lynch 2001). 


9.3.1 Uncertainty in Demographic Forecasts 


The three main processes of population change — mortality, fertility, and migration — 
all display significant amounts of uncertainty (Hajnal 1955; Orrell 2007). However, 
the relative levels of uncertainty differ between them; mortality is generally 
considered the least uncertain, and migration the most uncertain (National Research 
Council 2000). As demographers have come to accept the significant challenge 
posed by uncertainty, statistical demography has grown significantly in recent 
decades. Courgeau refers to this as the “return of the variance” to demography 
(Courgeau 2012). 

The limits of predictability in demographic forecasting have been a topic of 
significant discussion within the demographic community (see Keyfitz 1981; Willekens 
1990; Bijak 2010). Demographers have argued that forecasts should move from 
deterministic to probabilistic approaches (for example, Alho and Spencer 2005). 
The field also acknowledges that predictions beyond a relatively short time horizon 
— a generation or so at most — have such high levels of uncertainty that scenario- 
based approaches to forecasting should be favoured (Orrell and McSharry 2009; 
Wright and Goodwin 2009). 
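
The difference between a deterministic and a probabilistic forecast, and the way uncertainty accumulates with the forecast horizon, can be illustrated with a toy Monte Carlo sketch. Everything here is an assumption made for illustration: the starting population, the normally distributed annual growth rate and its parameters, and the function names `simulate_paths` and `interval`. It is not a method drawn from the works cited above, which rely on far more careful models of mortality, fertility and migration.

```python
import random


def simulate_paths(pop0, mean_growth, sd_growth, years, n_paths, seed=1):
    """Simulate final population sizes under a randomly varying annual growth rate."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_paths):
        pop = pop0
        for _ in range(years):
            pop *= 1.0 + rng.gauss(mean_growth, sd_growth)
        finals.append(pop)
    return sorted(finals)


def interval(sorted_values, level=0.8):
    """Central predictive interval read off the sorted simulated outcomes."""
    n = len(sorted_values)
    lower = sorted_values[int((1 - level) / 2 * n)]
    upper = sorted_values[int((1 + level) / 2 * n) - 1]
    return lower, upper


# Hypothetical population of one million with uncertain growth of 0.5% a year.
short_term = interval(simulate_paths(1_000_000, 0.005, 0.01, years=5, n_paths=2000))
long_term = interval(simulate_paths(1_000_000, 0.005, 0.01, years=30, n_paths=2000))

print("5-year 80% interval: ", [round(v) for v in short_term])
print("30-year 80% interval:", [round(v) for v in long_term])
```

A deterministic forecast would report a single trajectory; the probabilistic version reports a range, and that range widens markedly as the horizon lengthens, which is one motivation for preferring scenario-based approaches over long horizons.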


9.3.2 The Problem of Aggregation 


Here we refer once again to the ecological and atomistic fallacies described above. 
While the advent of the event-history approach and related methodologies like 
microsimulation has moved demography away from focusing exclusively on either 
the individual or the population, these methods are still relatively new (Willekens 
2005; Courgeau 2007; Zinn et al. 2009). Microsimulation models are both multi- 
level and multi-state, meaning that individuals can move between states (such 
as health status, age group, socioeconomic class, etc.) according to transition 
probabilities estimated from survey data or census data. 
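
The multi-state mechanism just described can be sketched in a few lines. The states, the annual transition probabilities in `TRANSITIONS` and the function names below are all hypothetical and chosen purely for illustration; in a real microsimulation the probabilities would be estimated from survey or census data and would typically depend on age, sex and other characteristics.

```python
import random

# Hypothetical annual transition probabilities between health states.
# Each row sums to 1; "dead" is an absorbing state.
TRANSITIONS = {
    "healthy": {"healthy": 0.90, "ill": 0.08, "dead": 0.02},
    "ill": {"healthy": 0.20, "ill": 0.70, "dead": 0.10},
    "dead": {"dead": 1.0},
}


def step(state, rng):
    """Draw an individual's next state from the row of probabilities for `state`."""
    options = TRANSITIONS[state]
    return rng.choices(list(options), weights=list(options.values()))[0]


def simulate(n_individuals, years, seed=42):
    """Advance every individual year by year and count the final states."""
    rng = random.Random(seed)
    population = ["healthy"] * n_individuals
    for _ in range(years):
        population = [step(state, rng) for state in population]
    return {state: population.count(state) for state in TRANSITIONS}


counts = simulate(1000, 10)
print(counts)
```

Even this three-state toy needs seven probabilities; adding further states, or letting the probabilities vary by individual characteristics, multiplies the number of quantities that must be estimated from data.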

However, while these methods are certainly powerful, the challenge for demog- 
raphers has been their ever-increasing data requirements. The parameter space 
explodes in size as the ambition of these models grows, and thus demographers 
either find themselves at the mercy of data that is too limited to accommodate their 
research questions, or are simply unable to collect sufficient data in the first place 
due to prohibitive cost or organisational difficulties. We will examine this particular 
point in more detail in Sect. 9.8 when we discuss “feeding the beast”. 


9.3.3 Complexity vs. Simplicity 


The third main epistemological issue for demographers is a direct consequence of 
the challenges of uncertainty and aggregation. While the temptation in demography 
today is to tend toward ever more complex and sophisticated models, whether 
these models are actually more powerful than their simpler neighbours is still an 
open question. Demography is fundamentally a discipline focused on predicting 
population change, and in that respect, there is no evidence to suggest that complex 
models outperform simpler ones (Ahlburg 1995; Smith 1997). 

Having said that, however, if we were to react too strongly to this revelation 
and throw aside complex models in favour of simpler ones, we may not achieve the 
results we desire. Developing detailed understanding of demographic processes may 
require coping with highly complex datasets and interactions. In those instances, 
simplicity may abstract away too many of the relevant factors for us to identify the 
key elements in the processes we wish to study. Prediction, after all, is a key goal 
in demography, but is far from the only goal; understanding and explanation are just 
as valid goals for us to pursue, and — unfortunately for us — often we cannot escape 
the impact of complexity in that context. 


9.3.4 Addressing the Challenges 


While these three key challenges are quite significant, demography has moved 
forward in partnership with statistical innovations to develop techniques that 
can help us cope with these new realities. For example, recent developments in 
uncertainty quantification for complex models have made clear that models are 
themselves a source of uncertainty, right alongside the factors mentioned above 
(Raftery 1995). Bayesian statistics has presented us with several approaches to 
dealing with this aspect, such as including a term for model error or code error 
while building a model (Kennedy and O’Hagan 2001). 

The Bayesian perspective has also informed new approaches to mapping the 
relationships between model parameters and model outputs, even in highly complex 
computational models. Perhaps most accessible among these has been the Gaussian 
process emulator, a method for analysing the impact of model parameters on 
the final output variance (Kennedy and O’Hagan 2001; Oakley and O’Hagan 
2002). While this approach has been most commonly used in highly complex 
computational simulations like global climate models, Gaussian process emulators 
have also been put to use in demographic projects³ in recent years (Bijak et al. 2013; 
Silverman et al. 2013a,b). 
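
To make the idea of an emulator more concrete, the following is a minimal sketch, not the method used in the studies cited above. It assumes a zero-mean Gaussian process with a squared-exponential covariance and fixed, hand-picked hyperparameters; `toy_simulator`, its single parameter and the design points are hypothetical stand-ins for an expensive simulation. The emulator is fitted to a handful of simulation runs and then returns a predicted output, with an accompanying variance, at untried parameter values.

```python
import math

def sq_exp_kernel(a, b, length_scale=0.3, variance=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return variance * math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def toy_simulator(param):
    """Hypothetical stand-in for one expensive simulation run."""
    return math.sin(3.0 * param) + param

def fit_emulator(design_points, outputs, jitter=1e-8):
    """Build the covariance matrix of the design and solve for the weights."""
    K = [[sq_exp_kernel(a, b) + (jitter if i == j else 0.0)
          for j, b in enumerate(design_points)]
         for i, a in enumerate(design_points)]
    weights = solve(K, outputs)
    return K, weights

def predict(x_new, design_points, K, weights):
    """Posterior mean and variance of the emulator at an untried input."""
    k_star = [sq_exp_kernel(x_new, x) for x in design_points]
    mean = sum(k * w for k, w in zip(k_star, weights))
    v = solve(K, k_star)
    var = sq_exp_kernel(x_new, x_new) - sum(k * vi for k, vi in zip(k_star, v))
    return mean, max(var, 0.0)

# Five simulation runs form the training design (an arbitrary choice).
design_points = [0.0, 0.25, 0.5, 0.75, 1.0]
outputs = [toy_simulator(x) for x in design_points]
K, weights = fit_emulator(design_points, outputs)

mean, var = predict(0.6, design_points, K, weights)
print(f"emulated output at 0.6: {mean:.3f} (variance {var:.5f})")
```

Because the emulator is cheap to evaluate, a sensitivity analysis of the kind described above could be run on the emulator rather than on the simulation itself. In practice the hyperparameters would be estimated from the simulation runs rather than fixed by hand.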

Thus, as demography has continued to advance to cope with the challenges 
wrought by complexity, it has moved toward methods and perspectives more 
commonly associated with complexity science and related disciplines. The prospect 
of incorporating more exploratory modelling practices within the discipline has led 
some to seek a movement toward demography as a model-based science (Burch 
2003; Courgeau et al. 2017), much like in population biology (Godfrey-Smith 2006; 
Levins 1966). 


9.4 Moving Toward a Model-Based Demography 


Having established that demography, and the population sciences more broadly, 
have begun to move toward a model-based paradigm and incorporate insights from 
disciplines already inclined in that direction, we will revisit some concepts from 
Part II in order to start to bring together a coherent framework. 

In Chap. 5, we outlined a key distinction between two streams of modelling for 
the social sciences: social simulation and systems sociology. Systems sociology is 
fundamentally a more explanatory, and exploratory, form of modelling in which we 
focus on understanding foundational social theories that lead to the development and 
evolution of society. Demography, generally speaking, clearly leans more toward the 
social simulation stream, in which the focus is on modelling specific populations and 
developing powerful links with empirical data. Microsimulations, for example, fall 
under this category, given their dependence on transition probabilities derived from 
empirical population data (Zinn et al. 2009; Willekens 2005). 
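
The dependence of microsimulation on transition probabilities can be illustrated with a deliberately small sketch. The states and annual probabilities below are invented for illustration; in a real microsimulation they would be estimated from empirical population data, and would typically vary by age, sex and other covariates.

```python
import random

# Hypothetical annual transition probabilities between states.
# Each row must sum to one.
TRANSITIONS = {
    "single": {"single": 0.90, "partnered": 0.09, "dead": 0.01},
    "partnered": {"single": 0.03, "partnered": 0.96, "dead": 0.01},
    "dead": {"dead": 1.0},
}

def step(state, rng):
    """Draw an individual's next state from the row for their current state."""
    options = TRANSITIONS[state]
    return rng.choices(list(options), weights=list(options.values()))[0]

def simulate(n_individuals, n_years, seed=0):
    """Advance a cohort year by year and return the final count in each state."""
    rng = random.Random(seed)
    population = ["single"] * n_individuals
    for _ in range(n_years):
        population = [step(s, rng) for s in population]
    return {s: population.count(s) for s in TRANSITIONS}

print(simulate(n_individuals=1000, n_years=10))
```

Every behavioural assumption here lives in the transition table, which is why the quality of the underlying population data matters so much for this class of model.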

Huneman (2014) suggests that we can further distinguish simulation approaches 
within that social simulation branch, between weak and strong simulations. The 
former aims for a scientific approach, looking to test a hypothesis even when data 
is hard to come by. The latter lies more in ‘opaque thought experiment’ territory, 
looking to explore simple models without being dependent on a specific theoretical 
basis. In the context of modelling for the social sciences there are strong similarities 
here, of course, to the systems sociology approach, in that in both cases we seek to 
step away from strong empirical ties and examine theories at a more foundational 
level. 

³ We will examine the application of Gaussian process emulators to demography in significant 
detail later in Part III. 

However we may characterise these more abstract models — as systems sociology 
models or strong simulations — demography has at its core a strong commitment 
to population data and empirically-relevant results. In that context, demographers 
seek empirically-derived foundations for modelling social mechanisms within a 
model; the exploratory, generative approach does not provide the kind of insight 
demographers find most valuable. In order to develop these foundations in a model- 
based demographic approach, we can bear in mind the suggestion of Conte et al. 
that: 


... simulations must be accompanied by micro-macro-loop theories, i.e., theories of mech- 
anisms at the individual level that affect the global behavior, and theories of loop-closing 
downward effects or second-order emergence. (Conte et al. 2012, p. 342) 


Thus, a critical component of this modelling enterprise must be developing an 
understanding of this micro-macro link, and in the context of a simulation approach 
that suggests we must remain committed to a multilevel approach. This provides 
certain advantages as well, in that powerful tools already exist for multilevel 
modelling within demography; in a sense, then, we are simply updating the way 
in which these levels are being represented by putting them into simulation. 


9.4.1 The Explanatory Capacity of Simulation 


As we have discussed at length previously, agent-based models are uniquely 
positioned to provide greater explanatory power when applied to complex adaptive 
systems. This is just as attractive within a demographic context as it is for other 
social sciences (Burch 2003; Silverman et al. 2011). By allowing the modeller to 
represent the interactions between individuals and macro-level processes, agent- 
based models can grant us greater insight into how these different levels of 
activity influence one another (Chattoe 2003). However, taking advantage of this 
aspect requires that we develop a more sophisticated understanding of these 
interactions themselves; in the empirically-focused demographic context, simply 
creating behavioural rules for these interactions out of best-guess intuitions is not 
sufficiently rigorous. 

In order to delve more deeply into interactions between these levels of analysis, 
we may situate these interactions themselves as objects of scientific enquiry. By 
explicitly modelling these interactions in simulation, we can better represent the role 
of multiple, interacting systems in the final demographic outcomes we see in our 
empirical observations. This would shift demography more toward a model-based 
framework, and in so doing allow demographers to contribute more to theoretical 
advancements in the study of population change. To an extent this shift has already 
begun, as the incidence of demographic agent-based models influenced by theories 
of social complexity has increased since the turn of the century (Kniveton et al. 
2011; Bijak et al. 2013; Silverman et al. 2013a; Willekens 2012; Geard et al. 2013). 


9.4.2 The Difficulties of Demographic Simulation 


While the prospect of a model-based demography offers many advantages, no 
approach comes free of drawbacks. As discussed in Part II, demography — and 
social science more generally — presents a difficult target for simulation modellers 
given the need for robust social theories to underpin their simulation efforts. Social 
theories are not difficult to find, but they are difficult to validate (Moss and Edmonds 
2005). While demography differs from other social sciences in its applied focus and 
the rich population data from which it draws its insights (Xie 2000; Hirschman 
2008), demographers interested in simulation must still rely on a solid theoretical 
backdrop in order to justify the conclusions drawn from their models. 

For demography to move forward as a model-based discipline, particularly with 
agent-based models, the discipline’s practical focus must be maintained. This means 
that simulations demographers build must be underpinned by population data, 
and, crucially, they must be constructed inductively. To do otherwise would be to 
construct social simulations that, while perhaps enlightening in terms of testing 
social theories, would have little to say about the core questions that have motivated 
demographers for these last 350 years. 

As Courgeau et al. (2017) suggest, these tensions between the expansive explanatory 
power of simulation and the focused empirical character of demography are not 
necessarily unresolvable. Following the example set by the historically cumulative 
progression of demographic knowledge outlined earlier, a model-based demography 
can build upon the power of the multilevel paradigm, incorporating the capabilities 
afforded by simulation approaches. In this way we establish a true model-based 
demography which retains the core empirical character of the discipline, while using 
simulation to enhance the explanatory power of demographic research. 


9.5 Demography and the Classical Scientific Programme 


Returning once again to the pioneering, foundational work of Bacon, Graunt and 
others, we can revisit the classical scientific programme of research and illustrate 
how a model-based demography enhances this approach. In the natural sciences 
this approach is very much still in evidence, but in the social sciences we see it less 
frequently. 

In essence, in the classical scientific approach we use observations of some 
natural property, and from those observations attempt to determine the axioms or 
laws driving the process behind this property (Courgeau et al. 2017; Franck 2002a). 
The classical programme is thus functional-mechanistic, in that we are investigating 
the process generating the property we are interested in by modelling the functions 
underlying that process. In the context of demography, for example, we set out to 
understand the process generating patterns of population change, which at the core 
are controlled by the three functions of mortality, fertility and migration. Courgeau 
et al. (2017, p. 42, original emphasis) identify some key examples of this functional- 
mechanistic approach to social sciences in the past: 


The ‘law’ of supply and demand, as another example, is the ‘first’ structure of functions 
which was inferred (induced) by Adam Smith from the observation of markets: it rules 
the process of social exchanges generating the market. Karl Marx inferred the general 
structure of functions ruling the process that generates industrial production from a thorough 
historical study of the technical and social organisation: this ‘first’ principle consists of 
separating labour and capital. Finally, Durkheim inferred the integration theory from 
a sustained statistical analysis of the differences in suicide rates between several social 
milieus: the social process which generates suicides, whichever their causes, is ruled by the 
integration of the individual agents. The application of the classical programme led to these 
prominent theoretical results at the height of social sciences. 


In these examples we see that significant theoretical advances in social sciences 
have come about thanks to the considered application of the classical scientific 
programme. Smith, Marx and Durkheim chose a social property to focus on — 
the market, industrial production, and suicide, respectively — and in each case 
used thorough observations to infer the functional structure underlying these social 
properties. Of course the impact of these inductive scientific efforts should not be 
underestimated; Marx’s Capital, for example, remains perhaps the most influential 
critique of capitalist modes of production ever written, while Adam Smith is 
memorialised in the names of free-market thinktanks the world over. 

Demography, as pointed out above, has adhered to a largely similar programme 
over its history. Observations of populations over the centuries from Graunt 
onwards have identified mortality, fertility and migration as the primary functions 
ruling the process of demographic change. Identifying these core functions has 
helped in turn to focus demographers on those social factors which contribute to 
these functions, which in turn helps identify those specific demographic variables 
which may be of greatest interest for further refinement of our understanding of 
those three functions. 

However, in recent years some have proposed that demography has strayed 
from its scientific lineage (Courgeau and Franck 2007). The power of demographic 
methods, and their widespread acceptance amongst policy-makers, has led to a 
reduced focus on theoretical innovation in the discipline (Silverman et al. 2011). 
Yet we cannot declare an ‘end of history’ in demography; the surge in interest 
in complexity and agent-based approaches since the early 2000s makes that clear. 
Demography should continue to evolve cumulatively to adapt to new challenges, as 
it has done in the past, and here we suggest that a model-based demography rooted 
in that classical scientific programme should be the next step in that evolution. 

Figure 9.1 below illustrates the key stages of a functional-mechanistic approach to 
demography. We start, as has been the case historically, with observations of the 
properties of demographic processes (mortality, fertility and migration). Through 
detailed analysis of this data, we proceed to the second step, and attempt to infer the 
functional structure underlying those processes. From there, we seek to identify the 
social factors which contribute to those functions, which then leads us to be able to 
model that structure in detail. 

Model-based demography then uses this process as a basis for the next two 
stages. Conceptual modelling allows us to develop and construct simulations of the 
interactions at play in the demographic system of interest. The results produced 
by these simulations can help us to identify areas in which further data collection 
would be advantageous, and at that point we can start the cycle again. Thus we see 
model-based demography as an iterative process, in which each trip through this 
cycle allows us to further refine both the empirical processes and the simulation 
design and development. 

Courgeau et al. (2017, p. 44) characterise this new research programme as 
consisting of three main pillars: 


1. Adherence to the classical programme of scientific enquiry 

2. Enhancement of the ways in which demographic phenomena are measured and inter- 
preted 

3. The use of formal models, based on the functional-mechanistic principles, as fully- 
fledged tools of population enquiries. 


Thus the focus is on integrating functional-mechanistic models directly into 
the practice of demography, as a cumulative enhancement of previous paradigms. 
These models are not intended to become a replacement for previous methods, 
nor an object of interest in and of themselves; instead, they are part and parcel 
of the demographic research process, both informing and being informed by the 
observations that form the empirical heart of the discipline. 

[Figure 9.1: a cycle of six stages. (1) Observation of properties of the demographic 
processes; (2) inferring the underlying functional structure; (3) identification of 
contributing social factors (variables); (4) conceptual and mathematical modelling of 
the structure; (5) computational model design, execution and analysis; (6) guidance 
for data collection and further observations, which leads back to stage 1.] 

Fig. 9.1 The inductive functional-mechanistic approach in model-based demography (see 
Courgeau et al. 2017, p. 43) 

Courgeau et al. (2017) further suggest that demography cannot rely on other social 
science disciplines to provide key innovations. Indeed, as 
we have seen in Part II, the difficulties we encounter in the simulation of social 
systems are common to the field at large. Demography also has a certain empirical 
advantage over most other social science disciplines, so taking on board 
theories and methods from less empirically-focused social sciences could instead 
reduce demographers’ ability to benefit from the richness of population data. 

The discipline thus presents an intriguing example of the challenges we must 
face when developing a model-based approach that is amenable to simulation. 
While agent-based models can offer substantial power and flexibility to answer 
appropriately-posed research questions, there is a balance to be struck between 
embracing that power and ensuring that the core empirical basis and theoretical 
backdrop of a given discipline are maintained. We need to consider carefully how 
a model-based approach may move us closer to solving the core epistemological 
difficulties in demography; transforming demography into a social simulation 
discipline wholesale might help us shift away from those responsibilities, but it may 
not help us actually address them. 


9.7 Overcoming the Limitations of Demographic Knowledge 


As discussed in Sect. 9.3 above, demography faces some key limitations in its ability 
to explain demographic phenomena. One measure of the success of our proposed 
model-based demography is whether it could allow us to overcome these limitations, 
and bolster both the theoretical and explanatory capacity of demography beyond the 
limits of its current statistically-focused methodological foundation. 

The problem of uncertainty in demography has led to the emergence of new 
statistical methods within the discipline, and a general agreement that demographic 
predictions become too uncertain to be useful beyond a generation or so (Keyfitz 
1981). The use of simulation within a model-based demography could help us 
to circumvent this limitation by facilitating the use of computational models for 
scenario generation. Simulations can be used to explore the parameter space in 
which they operate, investigating how different scenarios might affect the behaviour 
of the simulated population at both the individual and population levels. While these 
scenarios would not magically present us with enhanced predictive power, they 
would enable us to present possible ways in which populations may change beyond 
the one-generation time horizon, given certain assumptions about which parameters 
are most susceptible to variation. 
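
Scenario generation of this kind can be sketched in a few lines. The example below is a toy under stated assumptions: `project` is a hypothetical deterministic projection with a constant natural growth rate and constant net migration, standing in for a full simulation, and the parameter grids and starting population are arbitrary. The point is only the pattern of sweeping a parameter space and comparing outcomes beyond a one-generation horizon.

```python
from itertools import product

def project(pop0, growth_rate, net_migration, years):
    """Toy projection: constant natural growth plus constant net migration."""
    pop = float(pop0)
    path = [pop]
    for _ in range(years):
        pop = pop * (1.0 + growth_rate) + net_migration
        path.append(pop)
    return path

# Hypothetical parameter grids defining nine scenarios.
growth_rates = [-0.005, 0.0, 0.005]
migration_levels = [-1000.0, 0.0, 1000.0]

# Final population after 50 years for every combination of parameters.
scenarios = {
    (g, m): project(1_000_000, g, m, years=50)[-1]
    for g, m in product(growth_rates, migration_levels)
}

for (g, m), final_pop in sorted(scenarios.items()):
    print(f"growth={g:+.3f}, migration={m:+.0f}: {final_pop:,.0f}")
```

The output is a set of conditional statements of the form "if these parameters hold, the population ends up here", which is a different kind of result from a forecast.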

Model-based demography may also be able to help demographers cope with 
the aggregation problem. While representing both the micro and macro levels of 
analysis within a simulation is far from simple, some simulation projects have 
allowed for feedbacks between these levels (Billari and Prskawetz 2003; Murphy 
2003; Silverman and Bryden 2007). Representing these multiple levels in a single 
simulation, as well as the interactions between those levels, allows us to avoid the 
aggregation problem. However, in this context the question of which observations 
are most useful in such a complex model becomes more critical; we will revisit this 
issue in more detail in the next section. 
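
A feedback between levels can be shown with a very small threshold model. This is an illustrative sketch only, not a reconstruction of any of the models cited above: the agents, their thresholds and the migration decision are all hypothetical. Each agent decides whether to migrate by comparing a personal threshold with the share of the population that has already migrated, so the macro-level quantity is both produced by individual decisions and fed back into them.

```python
import random

def run(n_agents=200, n_steps=30, seed=1):
    """Return the share of migrants over time in a toy threshold model."""
    rng = random.Random(seed)
    # Micro level: each agent has a hypothetical personal threshold.
    thresholds = [rng.random() for _ in range(n_agents)]
    # A few agents with very low thresholds move first.
    migrated = [t < 0.05 for t in thresholds]
    history = []
    for _ in range(n_steps):
        # Macro level: the aggregate share is computed from the individuals.
        share = sum(migrated) / n_agents
        history.append(share)
        # Downward effect: the aggregate share enters each agent's decision.
        migrated = [m or share >= t for m, t in zip(migrated, thresholds)]
    return history

history = run()
print([round(s, 2) for s in history[:10]])
```

Both levels are explicit in the same model, so neither has to be inferred from the other after the fact.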

Finally, the problem of simplicity can also be addressed by well-considered 
simulation methods, particularly agent-based modelling. Statistical demographic 
models easily provoke a tendency toward the inclusion of ever-increasing amounts 
of data. However, agent-based simulations exhibit behaviour that is driven more by 
the choice and values of simulation parameters than by the data which is fed 
into them. As we have discussed in Part II, in some social simulations data is of 
little or no importance (as in the case of Schelling’s residential segregation model; 
Schelling 1978). Demography by its nature and its empirical focus requires more 
data input than most areas of social science, but the widespread use of agent-based 
approaches would necessitate a more careful approach to the integration and use of 
data. Failure to do so would see us struggling with issues of tractability as models 
became increasingly unwieldy and difficult to analyse; here we may wish to use 
the insights drawn in Part II from Lars-Erik Cederman (2002) and Levins (1966) 
to consider the appropriate balance of tractability versus the levels of generality, 
precision and realism required for the research question at hand. 


9.8 The Pragmatic Benefits of Model-Based Demography 


— But you are paying a lot of money for the dragon! 

— And what, should we just give it to the citizens instead? [...] I see you know nothing 
about the principles of economics! Export credit warms up the economy and increases 
the global turnover. 

— But it also increases the dragon as such — I stopped him. — The more intensely you feed 
him, the bigger he gets; and the bigger he gets, the higher his appetite. What kind of a 
calculation is it? He will finally devour you all! 


Stanisław Lem, Pożytek ze smoka [The Use of a Dragon] (1983/2008: 186) 


As alluded to above, a common problem faced by statistical demographers 
is the pressure to bolster the empirical power of demography, or perhaps more 
properly the perceived empirical power of demography, by including ever-larger 
amounts of population data (Silverman et al. 2011). The rise of multilevel modelling 
and microsimulation approaches has made the problem even more evident, as the 
laudable goals of reducing uncertainty and unravelling the micro-macro link lead 
to an explosive growth in data requirements. 

This tendency has effects beyond just creating large and unwieldy models. The 
process of population data collection itself is both time-consuming and expensive, 
requiring the design, distribution and collection of increasingly complex surveys. 
As these surveys grow more complicated, so does the data analysis process, and 
designing the subsequent statistical or microsimulation models becomes ever more 
difficult. 

Silverman, Bijak and Noble call this process ‘feeding the beast’ (Silverman et al. 
2011), in which demographers get caught in a vicious cycle of sorts, attempting 
to feed data-hungry models with increasing amounts of data, only to feel pressure 
to further ‘improve’ these models next time around with yet another injection of 
observations. While this process is a result of fundamentally positive motivations, 
evidence suggests that complex models do not necessarily demonstrate better 
predictive capacity than their simpler fellows, though complex models do require 
more costly data collection and would tend to have a longer turn-around time 
between new versions. 

Silverman and Bijak cite a couple of examples of this phenomenon: 


• Weidlich and Haag (1988) developed an ambitious system dynamics model of 
migration which attempted to address the micro-macro link; however, the model 
had very significant data requirements and did not fully address some of the 
complexities of the migration process itself due to the lack of individual agency 
in the model. 

• The MicMac project (Willekens 2005; Zinn et al. 2009) proposed a new method 
of dynamic microsimulation which consists of a macro portion (‘Mac’) and a 
micro-level model (‘Mic’). However, this modelling method is likewise very 
hungry for data; the ‘Mac’ portion needs detailed data on transition rates, while 
the ‘Mic’ portion requires a number of variables to be specified at the individual 
level. 


With this in mind, our proposed model-based demography should proceed with 
awareness of the problem posed by the data-hungry ‘beast’, and offer solutions that 
protect the empirical focus of demography while helping us to build models that — in 
the words of John Hajnal — “involve less computation and more cognition than has 
generally been applied” (Hajnal 1955, p. 321). In Chap. 10, we will begin to present 
some demographic models which attempt to apply these principles, avoiding ‘the 
beast’ while maintaining the empirical focus expected in the discipline. 


9.9 Benefits of the Classical Scientific Approach 


Even if we accept that simulation approaches to demography can provide us signif- 
icant benefits, both theoretical and more pragmatic, there is a danger that we may 
exchange some strengths of demography for weaknesses of simulation. We propose 
that a fully fleshed-out model-based paradigm connected to the classical scientific 
programme in demography would alleviate at least some of these concerns. 

A significant problem in the simulation approach, as outlined in Part II, is the 
complexity of social systems and thus the inherent difficulty in selecting which 
components of those systems are to be represented in simulation. Selecting these 
components generally comes about through the selection of a favoured social theory, 
or a core set of assumptions about the functioning of the social processes under 
examination. 

The classical scientific programme helps in our drive to select the most relevant 
structures and functions which should be replicated in simulation. When under- 
taking an examination of some social property under the classical programme, we 
narrow our focus to those processes which generate that property. Our observations 
focus on that one particular social property, and using those observations we 
then seek to infer (induce) those functions which generate that property. In this 
way proceeding via the classical scientific programme helps reduce complexity 
in our modelling enterprise; our observations focus our enquiries on processes 
and functions plausibly connected to the property of interest, and this in turn 
provides clearer guidance on which particular variables must be parameterised and 
instantiated in simulation. This inductive process thus helps us to avoid the problem 
of complexity in demographic research. 

Another advantage of the classical scientific programme is its ability to generalise 
social models. The classical scientific focus on functional-mechanistic explanations 
means that we are examining social systems in an analogous way to natural and 
biological systems, in an attempt to reverse engineer the means by which a social 
process is generated (Franck 2002a). From this viewpoint, when we see a social 
process replicated in another population, for example, we can reasonably posit 
that the same generative process, and thus the same functional structure, should 
be present. In this way we are developing functional-mechanistic explanations for 
social processes which can be validated in the real world — due to the inductive 
process underlying these explanations which relies upon empirical observations in 
the first place — and which can be generalised, assuming that the iterative process of 
model refinement and data collection confirms our conclusions about the generative 
process we identified. 


9.10 Conclusions 


This chapter has been, in a sense, a whirlwind tour of the discipline of demography, 
its strengths and weaknesses, and its prospects for the future. As we have seen, 
demography is a storied discipline, centuries old and tied deeply into local, national 
and global institutions of politics and policy-making. Understanding and forecasting 
human population change is of vital relevance to any modern society, after all; 
without a clear picture of our society and where it is headed, planning for social 
policies, immigration, health services, tax structures, and so many more aspects of 
our governance becomes far more difficult. 

That real-world, empirical focus in demography is clearly its greatest strength; 
the rich nature of population data has allowed demography to develop into a 
methodical, statistically advanced discipline quite unlike most social sciences. 
However, these strengths have brought their own challenges, and in particular the 
three epistemological limitations in demography of uncertainty, aggregation and 
complexity have led to significant debate within the field about the constraints of 
demographic enquiry and how to proceed (Keyfitz 1981; Xie 2000; Silverman et al. 
2011). 

Here we have outlined a model-based demographic research programme, taking 
inspiration once again from population biology and developing the social simulation 
stream of research into a form that maintains the empirical richness of demogra- 
phy. The model-based programme builds upon the four existing methodological 
paradigms in demography, enhancing the power and flexibility of multilevel mod- 
elling approaches. The model-based programme is intended to be an integrated part 
of the demographic research process, allowing models to influence and in turn be 
influenced by developments in data collection and analysis. 

Model-based demography also allows us to extend our predictive horizon in 
demography, using scenario-based simulation approaches to explore areas of the 
parameter space beyond the notional one-generation time horizon. As we will see 
in subsequent chapters, exploring this parameter space in detail using methods 
like Gaussian process emulators further enables us to understand the behaviour 
of our simulations, and identify scenarios that may be of particular interest to 
policy-makers looking to plan for policy spillover effects or unexpected shifts in 
the population (Silverman et al. 2013a). The incorporation of multiple, interacting 
levels of social processes in our models can allow us to avoid the ecological and 
atomistic fallacies (Silverman et al. 2011), and better understand the interactions 
between social processes that generate key effects at the population level. The ability 
of agent-based simulations to incorporate individual-level behaviours means that 
we can also incorporate qualitative data into our models in both the design and 
implementation phases (Polhill et al. 2010), adding another avenue of empirical 
relevance to our arsenal. 

Next we will analyse some examples of simulation modelling in the demographic 
context, in order to further develop the model-based programme and identify 
productive avenues for simulation approaches to population change. In so doing 
we will discuss aspects of model analysis and uncertainty quantification and 
how they can help us avoid the problem of complex social models becoming 
intractable and impenetrable. Ultimately, demography gives us an exciting example 
of how a fundamentally classical, empirical discipline can use those strengths 
to its advantage when adopting a methodology most commonly associated with 
generative, theoretical explanations of social processes. This should serve as a 
useful model for other disciplines wishing to expand into the simulation arena while 
maintaining a focus on empirically-driven, policy-relevant research. 
Chapter 10 
Model-Based Demography in Practice: I 


Eric Silverman, Jakub Bijak, and Jason Hilton 


10.1 Introduction 


In the previous chapter, we outlined a new research agenda for demography inspired 
by the model-based science approach in disciplines such as population biology 
(Godfrey-Smith 2006). This paradigm is constructed as a cumulative extension of 
the previous four paradigms in demography — period, cohort, event-history and 
multilevel — and makes use of computational simulation approaches, particularly 
agent-based modelling. Ultimately, we propose that model-based demography will 
allow the discipline to develop a better understanding of the complex processes 
underlying population change at both the micro and macro levels, while retaining 
demography’s characteristic empirical focus. 

However, as with any massive change in disciplinary practices, we cannot 
expect demography to shift wholesale toward a model-based approach without 
some illustrative proofs-of-concept to draw from. Learning to design and implement 
simulations is no mean feat, and as outlined throughout this volume, tackling this 
task requires in many cases a significant shift in the type of research questions one 
seeks to ask. That being the case, how might demography move forward from here? 

In this chapter we will focus on a few examples of demographic modelling which 
have successfully used agent-based modelling to add to demographic knowledge. 
We will start by discussing two well-known examples of demographic simulation, 
and transition from there into a detailed discussion of a more recent simulation 
project targeted specifically at the population sciences. 


10.2 Demographic Modelling Case Studies 


The history of demographic simulation is rather short, and in general relatively few 
demographers have made the jump to computational approaches. However, since the 
early 2000s there have been a few seminal papers which have capably illustrated the 
possibilities inherent in these approaches, and these have influenced further work 
in subsequent years. Before diving into a detailed discussion of a recent model, we 
will outline two of these seminal papers, one of which in particular is closely related 
to the ‘Wedding Doughnut’ model to be discussed later in this chapter. 


10.2.1 The Decline of the Anasazi 


Axtell et al.’s 2002 model of the Kayenta Anasazi (Axtell et al. 2002) is well-known 
not just in demographic and archaeological circles, but in social simulation circles 
more generally. The model attempts to paint a historical demographic portrait of 
the Kayenta Anasazi tribe, who lived in Long House Valley in Arizona between 
around 1800 BC and 1300 AD. At the end of that time, there was a precipitous drop 
in population, leading to a mass migration of Anasazi people out of the valley. 

Axtell et al. used this scenario as a test case for the computational simulation of 
agricultural societies and their cultural and economic evolution (Axtell et al. 2002). 
The model reconstructs the Long House Valley landscape and places agents within 
it, though given the limitations of the historical data available agents represent 
households rather than individuals. As is typical for an agent-based model, the 
modellers developed rules of behaviour for the agents to follow when choosing 
locations to settle and cultivate. In brief, the agents seek locations which offer 
potential for successful maize cultivation, taking into account the location of other 
resources such as water and the proximity of neighbours. Agents can change their 
location in subsequent years based on the success of their efforts in providing 
nutrition; the level of nutrition received from these crops in turn affects agents’ 
fertility. 

The simulated population produced dynamics which reflected the development, 
and eventual decline, of the real Anasazi population. Intriguingly, results also 
show that the northern reaches of Long House Valley could have still supported 
a reduced population, even during the late stages of the Anasazi’s cultivation of 
the valley and the resultant degradation in soil quality. While the reason for this 
difference in outcomes is unclear, the results of the simulation show that in order 
to survive and stay sustainable the virtual Anasazi had to disperse into smaller 
communities; perhaps the real Anasazi were unwilling to fundamentally shift a 
settlement pattern and community organisation that had persisted for centuries, and 
chose to move on rather than make these significant adjustments. 

A subsequent replication of the model¹ confirms Axtell et al.’s results, and 
reconfirms that environmental factors alone cannot account for the abandonment of 


¹ Unfortunately, model replications are surprisingly rare in interdisciplinary simulation research. 
The constant push for ‘new’ research results de-emphasises the role of replication in the scientific 
process, which in our view is damaging to our collective efforts. A good replication can confirm or 
challenge previous results, illuminate areas for improvement, and generally help the development 
of a rigorous and well-founded model-based science. 
Long House Valley by the Anasazi (Janssen 2009). The model results are sufficiently 
close to the historical data to suggest that the simple behavioural and environmental 
rules in place provide a plausible explanation for the demographic changes seen in 
the Anasazi population. At the same time, we see that further details are needed to 
understand why the valley was abandoned. 

Most importantly for our purposes, the explanation provided by this model takes 
us beyond what a statistical study founded in traditional demographic methods 
could have given us. By implementing an agent-based social system, we are able 
to illustrate and analyse the interactions between environmental pressures and low- 
level decisions about building homes and tending crops. Over the course of many 
simulated generations, we can see how these low-level processes drove a gradual 
migration northward through the valley as the environment became less suitable for 
farming, until finally only a small subsistence population could be supported. A 
statistical model with well-founded assumptions could have shown us the resultant 
population dynamics, but would not have facilitated this kind of detailed look at the 
low-level changes in community structure and behaviour over time. 

Further, we can see that the model follows some of the core tenets of the 
model-based approach we outlined previously. The model avoids ‘the beast’ of 
expensive and extensive data demands (Silverman et al. 2011) by making use of 
archaeological data already in use for other projects, so there was no need for 
significant expenditures of money and time on feeding vast amounts of data into 
the simulation beyond what was already available. Qualitative data drawn from 
ethnographic studies of the Anasazi was also brought into the modelling process 
to help formulate the agents’ behavioural rules. The model thus strikes a balance 
between staying empirically relevant — by using a digital reconstruction of the real 
landscape with behavioural rules developed from real data — and providing higher- 
level theoretical insight. 


10.2.2 The Wedding Ring 


Our next example of demographic simulation takes a rather different approach, 
focusing on a core concern to modern demography at a broader level rather than 
a specific historical case study. This well-known “Wedding Ring” model by Billari 
et al. (2007) attempts to bridge the micro-macro link discussed in the last chapter, 
in this case investigating the phenomenon of partnership formation. The authors 
describe the gap between the macro-level statistical explanations provided by 
demographers and the micro-level, individual studies on the process of searching for 
a partner, and view this model as an attempt to “account for macro-level marriage 
patterns while starting from plausible micro-level assumptions” (Billari et al. 2007, 
p. 60). 

The Wedding Ring uses the phenomenon of ‘social pressure’ as the under- 
pinnings of the partnership search process. In this formulation, social pressure 
originates from married agents within a social network, who then exert pressure on 
their non-married peers, who as the pressure increases feel an increasing urgency 
to find a partner. While most of us would prefer to think our partnership decisions 
are made purely out of love and not being pressured by family and friends, there is 
evidence from social research that social pressure does have a strong influence on 
our partnership choices (Bernardi 2003; Bernardi et al. 2007; Bohler and Fratczak 
2007). 

The virtual space Wedding Ring agents reside in is deliberately highly simplistic. 
Effectively they live in a cylindrical space, consisting of a one-dimensional spatial 
component and the vertical dimension of time (Billari et al. 2007). Agents each 
have a social network of what Billari et al. refer to as ‘relevant others’, which we 
can conceptualise as a separate space overlaid on the physical space defined by 
social distance between agents. As the number of married agents within the network 
of relevant others increases, social pressure on unmarried agents in that social 
neighbourhood also increases. Higher levels of social pressure directly influence an 
agent’s partnership decisions, as agents under greater pressure will search a greater 
area for their prospective partners. 
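
The link between married peers, social pressure and search range can be sketched in a few lines of Python. This is an illustration only: the chapter does not give Billari et al.'s functional forms, so the use of the plain married share as the pressure value, the linear scaling of the search radius, and the names `social_pressure`, `search_radius` and `max_radius` are all assumptions.

```python
def social_pressure(relevant_others_married):
    """Share of an agent's relevant others who are married (0.0 to 1.0).

    `relevant_others_married` is a list of booleans, one per relevant other.
    Using the raw share as the pressure value is an assumption.
    """
    if not relevant_others_married:
        return 0.0
    return sum(relevant_others_married) / len(relevant_others_married)


def search_radius(pressure, max_radius=10.0):
    """Distance within which an agent looks for a partner.

    The linear scaling and the default `max_radius` are hypothetical; the
    only property taken from the text is that more pressure means a wider
    search.
    """
    return max_radius * pressure
```

Under these assumptions an agent with half of its relevant others married searches half of the maximum distance, and one with no married peers does not search at all.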

Billari et al. note that this effectively represents marriage as a diffusion process; 
however, marriage in this context differs significantly from other diffusion processes 
in that it requires another agent to participate, meaning that higher levels of social 
pressure do not guarantee that an agent will find a partner (Billari et al. 2007). 
In another simplification, though perhaps reflective of social norms, agents will 
only agree to marriage when both partners are within a certain age range. Once a 
partnership is formed, agents can decide to reproduce, which adds more agents to the 
Wedding Ring who will then undergo the same process as they age. The Wedding 
Ring model does attempt to allow for heterogeneity in agent decision-making by 
dividing agents into five different classifications according to which age ranges of 
agents they trust most. In other words, some agents listen most to the opinions of 
older agents, others listen to younger ones, and some agents listen equally to both. 

Despite the overall simplicity of the model, the Wedding Ring produces hazard 
functions for partnership formation highly reminiscent of the patterns seen in the 
real world, and sensitivity analyses performed by the authors seem to indicate that 
the results are relatively robust to parameter changes (Billari et al. 2007). In the 
context of a nascent model-based demography seeking to avoid ‘the beast’, this 
study offers some reasons to be optimistic. The model’s initial population was 
generated to mirror the age distribution of America in the 1950s, but notably this 
is the only appearance of any empirical data within the model. The simplistic world 
of the Wedding Ring is one dimensional, and the agents’ behaviours are very simple, 
and do not take into account individual agent characteristics beyond age, location 
and social pressure classification. 

The lack of data in this case does not result in an irrelevant model, however. The 
Wedding Ring is able to replicate patterns of partnership formation that are reflective 
of patterns in the real world, and in the process offer social pressure as a possible 
low-level explanation of those higher-level patterns. A statistical model of the same 
process would need substantially more investment in terms of data collection to 
provide similar insights. As we can see from the Wedding Ring, constructing our 
agents’ behaviours on a foundation of social theory supported by evidence allows us 
to avoid that time-consuming aspect of traditional approaches, while still retaining 
empirical relevance and avoiding entirely arbitrary assumptions. 


10.3 Extending the Wedding Ring 


The Wedding Ring model, exciting as it may be, remains a proof-of-concept of sorts. 
The model provides a useful test-case for the application of relatively abstract agent- 
based models to the expansion of demographic knowledge, but there is plenty of 
room for further expansion. The exclusion of some more detailed social factors from 
the model also raises the question of whether the model results will remain robust 
when additional elements are added to more closely reflect real-world conditions. 

Here we will discuss in detail a modelling project which aimed to replicate and 
expand upon the initial Wedding Ring model.² The expanded model sought to use 
the Wedding Ring as a basis for developing a multilevel simulation which makes use 
of the strengths of both statistical demography and social simulation. In essence, the 
model takes the framework presented by Billari et al. and adds demographic data 
and predictions based on population data from the United Kingdom. Along with 
partnership formation, the model also includes a simplistic representation of health 
status, which showcases the capacity of models like this to be used as policy tools. 

We will begin by discussing the motivations and basic components of the model, 
before describing the specifics of the implementation. 


10.3.1 Model Motivations 


As we have discussed previously, demography and social simulation have been 
on a collision-course for some time. This model aims to take this process further 
by providing an exemplar model that combines several complex elements, namely 
demographic change, population health, and partnership formation. None of these 
elements is by itself a new topic for agent-based demography; a number of 
studies have focused on partnership formation, for example Billari and Prskawetz 
(2003), Todd et al. (2005), Billari et al. (2006, 2007), Billari (2006) and Hills 
and Todd (2008). However, we attempt to take things further here, by building a 
model that successfully ties together the strengths of statistical demography with 
social simulation. Such an enterprise can help the further development of model- 
based demography by demonstrating how the differing perspectives of these two 


² For significantly more detail on this project, please refer to Bijak et al. (2013) and Silverman et 
al. (2013), a pair of papers intended to illustrate the expanded model for both demographers and 
simulation researchers. This chapter is based on that work but does not include all the analyses 
present in those two papers. 
disciplines can be reconciled into a model that offers both novelty and policy 
relevance. 

The 2003 volume by Billari and Prskawetz (2003) was influential in demography 
for presenting an approach to what they call ‘agent-based computational demogra- 
phy’, or ABCD. They present ABCD as focusing on the development of theories in 
demography, rather than focusing exclusively on prediction. We might extend this 
further, following Epstein who provided numerous examples of uses for simulation 
beyond prediction, many of which are as applicable to population change as to any 
other complex phenomena (Epstein 2008). 

This model aims to further flesh out this approach in the context of model-based 
demography. While we outlined in the last chapter some principles by which the 
empirical focus of demography and the explanatory focus of simulation can be 
reconciled, putting this into practice requires the development of exemplar models 
that demonstrate the implementation of some of these principles. Some of this 
movement is already taking place, as we see in the popularity of microsimulation 
in demography, which incorporates some elements of mechanistic explanation 
(Gilbert and Troitzsch 2005), or increasingly empirically-focused models in various 
simulation domains (see Silverman and Bullock 2004 for a broad discussion, Grim 
et al. 2012 for an example). 

Despite these incremental steps, there are still significant challenges to overcome. 
Microsimulation models are embedded in the event-history paradigm, and as such 
suffer from the excessive data demands carrying over from statistical approaches 
(Silverman et al. 2011). They also rarely include elements of spatiality, or detailed 
representations of social processes (Gilbert and Troitzsch 2005). Similarly, inte- 
grating data directly into agent-based models is a difficult process, and as a 
consequence many agent-based models rely on assumptions about agent behaviour 
and interactions between processes (sometimes to their detriment, and sometimes 
not). Here we will present a model that attempts to marry these two streams of 
work together into a cohesive whole, while also presenting some methods of model 
analysis that make the results more tractable. 


10.3.2 Simulated Individuals 


In attempting to resolve the gaps between statistical and agent-based models, it can 
be useful to reflect on the ways in which these methods conceptualise individual 
members of a population. In the case of statistical demography, we use observa- 
tions such as survey or census data in order to understand statistical individuals 
(Courgeau 2012). When we take a further step into demographic microsimulation 
models, we start considering synthetic individuals (Willekens 2005), each consisting 
of a set of transition probabilities between different possible states during their life 
course. Upon taking the final step into agent-based models, we then need to consider 
the concept of the simulated individual. 
The simulated individual consists of a set of rules governing their behaviour, 
which must be parameterised according to our available knowledge of those 
behaviours. This knowledge may be founded entirely in empirical data, or it may 
be built on a framework of assumptions derived from well-tested social theories. In 
either case, our relationship with population data changes its character in the agent- 
based context, as we may not be able to directly translate our available data into 
parameters that govern our agents’ behaviours. 

However, taking a page from the discussion of the data-based ‘beast’, we can 
make an effort to incorporate demographic data when it is appropriate to the research 
question posed. Agent-based models don’t necessarily benefit from infusions of 
large amounts of data, which means we should incorporate data only when it adds 
to the empirical relevance of the simulation in question, not simply because it is 
desirable to do so. In order to alleviate concerns about our choices of parameter 
settings, particularly when data is missing or absent, we can use methods of 
uncertainty quantification to explore the parameter space and better characterise the 
behaviour of the simulation across a wide range of scenarios. 

Figure 10.1 below illustrates our synthesis of statistical demography and social 
simulation. In this framework, elements of statistical demography can work in 
combination with an agent-based approach. The agent-based model allows us to 
investigate possible scenarios of population change by exploring our parameter 
space, letting us look beyond the one-generation time horizon often cited in the 
demographic literature (Keyfitz 1981; Bijak 2010; Wright and Goodwin 2009). In 
turn, statistical demography contributes empirical relevance through the incorpora- 
tion of population data that can shape the development of our virtual population. 

Fig. 10.1 Two approaches to modelling social systems: statistical demography and agent-based 
social simulation (Source: own elaboration, reprinted from Figure 1 in Silverman et al. 2013, 
drawing from Willekens 2005 and Courgeau 2012) 

This approach thus brings together some of the key strengths of statistical demography 
and agent-based modelling that we have identified thus far. We are able to 
maintain demography’s empirical relevance and access to rich, detailed population 
data. We can use simulated agents to investigate the complex relationships between 
social factors at the micro level and macro-level population trends. By investigating 
our parameter space, we can study possible futures of population change that extend 
beyond the notional one-generation predictive horizon, putting these explorations 
into context by applying uncertainty quantification techniques. All the while we 
make sure to incorporate data that enhances our model but does not unnecessarily 
complicate our simulation; we feed ‘the beast’ only what is absolutely necessary, 
and we maintain a careful balance of tractability and realism, in keeping with a 
modified Levinsian framework. 


10.4 Extension Details 


The model presented here is an extension of the Wedding Ring (Billari et al. 2007) 
designed to demonstrate this modelling framework. The extended model, henceforth 
dubbed the Wedding Doughnut, takes the form of a toroidal physical space rather 
than the one-dimensional ring of the original. The Doughnut was developed in the 
context of a five-year research project, The Care Life Cycle, devoted to the study 
of social care in an ageing UK population; the partnership modelling aspects of the 
Ring form a useful foundation here, because understanding family structures is key 
to this area, given that the majority of social care in the UK is provided by family 
members (Vlachantoni et al. 2011). 


10.4.1 Spatial and Demographic Extensions 


Our decision to expand the agents’ world into a grid-based, toroidal shape was 
driven by the hypothesis that situating agents in a ring may restrict their ability 
to form diverse networks of ‘relevant others’ during the course of a simulation 
run. The Doughnut world is a square space, 72 grid cells on a side, which wraps 
onto a toroidal topology, meaning agents can form networks across the vertical and 
horizontal boundaries. The initial population of agents was set at 800 individuals, 
a significant increase over the Ring’s initial population of only 100 agents (Billari 
et al. 2007). 

This major change to the virtual space the agents inhabit necessitated a number 
of significant changes to the model code. The spatial locations of agents had to 
be recorded in a different way, and calculations for spatial separation also had to 
change. A number of tests were run before settling on the initial population of 800; 
ultimately this figure was chosen as the model seemed to produce varied dynamics 
without unduly lengthy running times. Finally, virtually all model parameters had 
to be completely re-tuned, as the defaults presented in the original paper were 
constructed specifically for the 1D ring. 
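
The change in how spatial separation is calculated can be illustrated with a short sketch. The Python below is not the project's code; the function name `toroidal_distance` and the choice of Euclidean distance are assumptions made for illustration, since the chapter does not state which metric the model uses. It shows how distance on a 72 by 72 grid that wraps at its edges takes the shorter way round on each axis.

```python
import math

GRID_SIZE = 72  # side length of the Doughnut grid, as given in the text


def toroidal_distance(a, b, size=GRID_SIZE):
    """Distance between grid cells a and b, each an (x, y) tuple, on a torus.

    Euclidean distance is an assumption; the source does not name a metric.
    """
    dx = abs(a[0] - b[0]) % size
    dy = abs(a[1] - b[1]) % size
    # On a wrapping grid the shorter route may cross a boundary.
    dx = min(dx, size - dx)
    dy = min(dy, size - dy)
    return math.hypot(dx, dy)
```

Two agents at opposite edges of the grid, for example at (0, 0) and (71, 0), are therefore neighbours at distance 1 rather than 71 cells apart.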

In order to connect the model more closely with real-world processes of 
population change, in keeping with the framework above we allowed the agents 
a much more varied life course and brought in statistical demographic elements.³ 
In the original Ring, agents all lived to the age of 100 and died only at that 
age; in the Doughnut, we allow agents to die at any time, driven by age-specific 
probabilities from the Human Mortality Database (2011). Given our interest in 
family structures for this model, we allow partnered agents to reproduce, with 
fertility rates taken from the Office for National Statistics (1998) and Eurostat (2011). 
Once the simulation advanced beyond our available population data, these rates 
were projected using bi-linear modelling methods derived from Lee and Carter 
(1992). Finally, the initial population was structured according to data from the 1951 
census in England and Wales. 
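
For readers unfamiliar with the method, the Lee and Carter (1992) model in its standard form, stated here for mortality, is bilinear in an age pattern and a time index. How it was adapted for the projections in this model is not described in this chapter, so the equations below should be read as background rather than as the exact specification used; see Bijak et al. (2013) for the details.

$$\ln m_{x,t} = a_x + b_x k_t + \varepsilon_{x,t}$$

Here $m_{x,t}$ is the rate at age $x$ in year $t$, $a_x$ is the average age profile of the log rates, $k_t$ is a time index capturing the overall level, $b_x$ describes how strongly each age responds to changes in $k_t$, and $\varepsilon_{x,t}$ is an error term. Projection then amounts to forecasting $k_t$ as a time series, which in the original paper is a random walk with drift:

$$k_t = k_{t-1} + d + e_t$$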


10.4.2 Health Status Component 


In order to connect this model with the urgent debates on social care provision in the 
UK and other ageing societies, we further expanded the model to include a health 
status component. This aspect is kept intentionally simplistic, as it is intended more 
as a proof-of-concept to illustrate how models of this type can be policy-relevant. 

In the health status component, agents may become ill at any given time 
according to a transition probability that increases as they age. We are restricting 
our focus here to long-term limiting illnesses which would require long-term social 
care in order to manage. Once agents contract such an illness in the simulation 
we assume they continue to display those symptoms until they eventually die. The 
transition probabilities are generated slightly differently for male and female agents, 
to account for the real-world trends in which males are somewhat more likely to 
enter a state of long-term limiting illness than females: 


$$p(x) = 0.0001 + 0.00041 \cdot \exp(x/16) \quad \text{(for males)}$$
$$p(x) = 0.0001 + 0.00039 \cdot \exp(x/14) \quad \text{(for females)} \qquad (10.1)$$
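
As a worked illustration, Eq. 10.1 can be written as a small function, reading the operator between each coefficient and the exponential as multiplication and taking x to be the agent's age in years. The Python below is a sketch rather than the model's own code; the function name `illness_probability` and the cap at 1.0 are additions made for the example.

```python
import math


def illness_probability(age, sex):
    """Annual probability of entering long-term limiting illness (Eq. 10.1).

    Assumes the coefficient multiplies the exponential term and that `age`
    is in years.
    """
    if sex == "male":
        p = 0.0001 + 0.00041 * math.exp(age / 16)
    elif sex == "female":
        p = 0.0001 + 0.00039 * math.exp(age / 14)
    else:
        raise ValueError("sex must be 'male' or 'female'")
    # The cap is a safety guard added here; it is not part of Eq. 10.1.
    return min(p, 1.0)
```

Calling `illness_probability(65, "male")` gives the probability that a healthy 65-year-old male agent becomes ill in a given simulated year; in a run this value would be compared against a uniform random draw.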


10.4.3 Agent Behaviours and Characteristics 


Agents in the Wedding Doughnut have the same characteristics as in the original 
Ring as described above, though additions were made in accordance with our 


³ For more details on the demographic elements, please see Bijak et al. (2013). 
extensions to their behaviours and the new shape of the space. Given that agents 
can move during their life course in this model, we record their spatial positions as 
coordinates on the grid. 

When agents form a partnership they are able to move on the grid to be with 
their new partner. In order to capture the effects of each partner’s social and familial 
connections, we place the new couple on the grid in proportion to the size of each 
agent’s social network of relevant others. Once the agents reproduce, any child 
agents are placed on the grid next to them; in this way we begin to see the formation 
of virtual households. Partnerships are permanent and can only form once both 
agents have reached the age of 16. In order to facilitate data analysis we 
keep extensive records for every agent, including their spatial location; location and 
IDs of any partners or children; years of birth, death and partnership formation; and 
health status. 


10.4.4 Simulation Properties 


The simulation is written in the Java programming language using the Repast 
simulation toolkit. This free Java library and graphical interface is specialised for 
the implementation of agent-based models, and includes built-in functions for data 
recording, real-time visualisation, and data analysis. While the simulated popula- 
tion grows substantially during any given run, the simulation is not particularly 
demanding in terms of computing resources; any given run would take 2 to 3 minutes on 
a mid-range home desktop PC. 

The simulation runs in discrete time-steps of one year. A complete simulation 
run consists of 300 years, starting from an initial population based on 1951 census 
figures, meaning that the final simulation year corresponds to the year 2250. In 
contrast, the original Wedding Ring ran for 150 years in each run. 

Each time-step follows an identical sequence of events: 


1. Agents age one year 
2. Agents without partners: 

   a. Identify relevant others 
   b. Calculate social pressure 
   c. Identify potential partners 
   d. Form a partnership if possible 

3. Agents with partners: 

   a. Check fertility status 
   b. If applicable, agents can produce children 

4. All agents: 

   a. Check health status 
   b. If results indicate, transition agent to state of ill health 
   c. Check mortality status 
   d. Agents who die have year of death recorded 

5. Remove dead agents from population 
6. Add new child agents next to their parents 
7. Record agent statistics in output files 
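
The yearly sequence of events listed above can be sketched as a single function. The Doughnut itself was written in Java with Repast; the Python below is only an outline of the ordering, and the `Agent` fields and the four rule functions passed in (`form_partnership`, `has_child`, `becomes_ill`, `dies`) are hypothetical placeholders for behaviour the chapter describes in prose. Spatial placement of children and file output are left out.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # compare agents by identity, not by field values
class Agent:
    age: int
    partner: Optional["Agent"] = None
    ill: bool = False
    alive: bool = True
    year_of_death: Optional[int] = None


def step(population, year, rng: random.Random,
         form_partnership, has_child, becomes_ill, dies):
    """Advance the population by one year, following the listed order.

    The four callables are hypothetical stand-ins for the model's rules.
    Returns the new population and a small summary record.
    """
    # 1. Agents age one year
    for agent in population:
        agent.age += 1

    # 2. Agents without partners look for one (relevant others, social
    #    pressure and candidate search all live inside form_partnership)
    for agent in population:
        if agent.partner is None:
            form_partnership(agent, population, rng)

    # 3. Agents with partners may have children (each couple checked once)
    newborns, seen = [], set()
    for agent in population:
        partner = agent.partner
        if partner is None or not partner.alive or id(agent) in seen:
            continue
        seen.update((id(agent), id(partner)))
        if has_child(agent, partner, rng):
            newborns.append(Agent(age=0))

    # 4. All agents: health check, then mortality check
    deaths = 0
    for agent in population:
        if not agent.ill and becomes_ill(agent, rng):
            agent.ill = True
        if dies(agent, rng):
            agent.alive = False
            agent.year_of_death = year
            deaths += 1

    # 5. Remove dead agents; 6. add the new children
    survivors = [a for a in population if a.alive]
    survivors.extend(newborns)

    # 7. Record statistics (a dict here, standing in for output files)
    stats = {"year": year, "population": len(survivors),
             "ill": sum(a.ill for a in survivors), "deaths": deaths}
    return survivors, stats
```

A full run would call `step` once per simulated year, feeding each returned population into the next call and collecting the `stats` records along the way.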


During the final statistics-recording step, we keep a summary file for the 
simulated population, as well as detailed logs of events from specific agents. We 
also record the results of continuous hazard-rate calculations throughout the run; 
these are further summarised every decade and at the end of the run. 


10.5 Simulation Results 


10.5.1 Population Change 


The demographic results derived from the simulation match the expected outcomes 
for an ageing population quite closely. While there are some notable differences 
due to the lack of international migration in our model, the results closely reflect the 
underlying processes of population change in this context. 

Figure 10.2 below shows a population pyramid with a direct comparison between 
the averaged outputs of 250 simulation runs and the 2011 census data from the 
UK. The simulated results follow the real results very closely, and provide some 
assurance that the simulation is accurately reflecting the core demographic processes 
of fertility and mortality (and domestic migration, if not international). Note that 
these results reflect the ‘base scenario’ of health status; the other scenarios will be 
detailed later in this section. 


10.5.2 Simple Sensitivity Analysis 


Following the example of Billari et al. (2007), we performed a simple sensitivity 
analysis of some core parameters defining the social pressure functions of the 
simulation. Figure 10.3 includes four different model scenarios: default parameter 
settings; constant social pressure that does not vary in proportion to an agent’s 
network of relevant others; a restricted spatial distance in which relevant others can 
be found; and a constant age influence, meaning that agents no longer differ in their 
responses to relevant others of differing age groups. 
[Figure: population pyramid. Legend: observed females, observed males, males (mean), females (mean)] 

Fig. 10.2 Population pyramid compared with UK census data (Source: Figure 2, Silverman et al. 
2013) 


The graphs in Fig. 10.3 are each the result of an average of ten simulation runs. 
The hazard graphs for each of the four implemented scenarios closely replicate the 
identical tests seen in the original Wedding Ring (Billari et al. 2007), though in 
the Doughnut we see agents tending to form partnerships significantly earlier than 
in the 1D case. This suggests that further tweaks of the parameter settings would 
be necessary to identify portions of the parameter space that more closely match 
partnership formation patterns in the real world. 

Further, we can see that the shape of the agents’ virtual space can influence the 
macro-level population patterns we observe. This is not unexpected, given that the 
micro-level processes we model here are heavily influenced by the spatial positions 
of agents, and those positions change over each agent’s life course as they form 
partnerships. As we will see in the next set of results, spatiality also plays a role in 
our analysis of health status. 
[Figure: Partnership Formation Hazard Rate. Axes: hazard rate against age. Legend: Constant Social Pressure, Small Spatial Distance, Constant Age Influence] 

Fig. 10.3 Hazard rates for four different simulation scenarios (Source: Figure 3, Silverman et al. 
2013) 


10.5.3 Scenario Generation Example 


One specific advantage of agent-based models in a context like the study of social 
care is that we can investigate the impact of the concept of ‘linked lives’. This is 
the idea that agents in social simulations can exhibit long-term connections to other 
simulated individuals, allowing us to investigate the impact of social interactions and 
relationships on social and health outcomes at both the micro and macro levels.⁴ 

Traditional statistical models in demography do not have this capacity, given their 
reliance on aggregate population data in many cases. Even multilevel microsimula- 
tion models generally conceptualise their agents as individuals following personal 
life-course trajectories without the capacity for much interaction with other agents. 
This model is based on an agent-based framework, and as we will illustrate, even 
our very simple health status component allows us some rudimentary exploration of 
the linked-lives phenomenon in the context of social care provision. 

For our simple analysis, we once again collected the results of 250 simulation 
runs, this time under three health scenarios: 


1. Base Health Scenario: default values for ill-health transition probabilities 
2. Good Health Scenario: halved values for ill-health transition probabilities 
3. Bad Health Scenario: doubled values for ill-health transition probabilities 


In implementing this simple comparison, we might imagine presenting this 
simulation to a group of policy-makers seeking insight on the expected levels of 
social care demand in the UK, extending quite far into the future. In such a case, 


⁴ For further discussion of this concept, see Noble et al. (2012) and Chap. 11 in this volume. 

[Figure: Population by Status, Simulation Year 2011] 

Fig. 10.4 Agent health status and care availability (results for 2011) (Source: Figure 4, Silverman 
et al. 2013) 


given the reliance on informal social care provided by family members, we may also
want to provide details on the expected supply of informal care under different 
possible scenarios of general population health. Even with this highly simplified 
example, we can use the simulated data to provide some level of insight into these 
key questions: 


1. How many ill agents will have access to informal care from their healthy spouses
or children? 
2. What proportion of agents may have unmet care needs? 


The latter question in particular is of interest to policy-makers, as unmet care 
needs will have to be met with other resources, most probably with state assistance. 

In Fig. 10.4 we present the outcomes of these three scenarios. Using our extensive 
records of the life-course of each simulated agent, we are able to determine the 
proportions of agents who are healthy or ill, and we can further subdivide the ill 
group into agents who could be cared for by their partners, children, or who have 
unmet care needs. We note here however that this is an optimistic virtual world 
in which we assume that any ill agents with available healthy spouses or children 
will receive care; in reality, this is unlikely to be the case, as partnerships may 
break down or children may move away or have other obstacles that prevent their 
availability for care. 
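The classification behind these proportions can be sketched as follows. This is an illustration under stated assumptions, not the model's code: the `Agent` class, its field names and the status labels are hypothetical, and checking the spouse before the children is an assumed ordering. As in the optimistic virtual world described above, any healthy spouse or child is taken to provide care.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Agent:
    """Hypothetical agent record: health flag plus family links."""
    ill: bool
    spouse: Optional["Agent"] = None
    children: List["Agent"] = field(default_factory=list)


def care_status(agent):
    """Classify one agent as healthy, cared for, or having unmet need."""
    if not agent.ill:
        return "healthy"
    if agent.spouse is not None and not agent.spouse.ill:
        return "cared for by spouse"
    if any(not child.ill for child in agent.children):
        return "cared for by child"
    return "unmet need"


def status_shares(population):
    """Proportion of the population falling into each status."""
    counts = Counter(care_status(agent) for agent in population)
    return {status: n / len(population) for status, n in counts.items()}
```

Relaxing the optimistic assumption would mean replacing the two availability checks with a probability that a healthy relative actually provides care.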

As we can see from the results for each scenario, the proportion of ill agents in the 
population grows massively as the transition probabilities increase, from 9% in the 
best case to 26% in the worst. Additionally, we observe that as the scenario worsens, 
the burden on children for providing care grows faster than for spouses; this is due
to the increased age-specific transition probabilities for ill health making it more
likely that spouses become ill together, rather than one being left in a sufficiently 
healthy state to provide care. 


10.5.4 Spatial Distribution of Informal Care 


We can also use our recording of agents’ spatial locations to investigate the 
distribution of healthy carers across our simulated Doughnut. Again this is a highly 
simplified example, but this gives us an illustration of how a model of this type might 
offer additional data over its statistical demographic counterparts. Using an agent- 
based methodology we are able to illustrate how agents are distributed spatially after 
simulation runs, and then draw conclusions about how those distributions vary under 
different scenario settings. In the case of social care, the availability of informal care 
within a reasonable distance could be an essential factor in determining whether an 
individual in need of care will need to seek state assistance in order to cope. 

Figure 10.5 shows the fraction of ill agents with healthy carers available within a 
given distance on the Doughnut. These values were calculated using agents’ spatial 
locations and health status across 250 simulation runs for each of our three health 
status scenarios. As we can see from the results, significantly fewer ill agents have 
access to carers in the same household (distance = 0) in the Bad Health Scenario 
than the Good Health Scenario. In a real-world context, this would tend to leave a 
greater burden on adult children to return home to provide care, as the absence of 


Fig. 10.5 Cumulative availability of care for ill agents by distance, simulation year 2011. Vertical axis: proportion of ill agents with a healthy spouse/child within a given distance; horizontal axis: distance (0 to 50); series: Base, Bad Health and Good Health Scenarios (Source: Figure 5, Silverman et al. 2013)

an in-house informal carer implies that both spouses are in need of care (or that the
ill parent is alone with their spouse having died). This has significant implications 
for policy-makers, who would need to consider how to support adult children who 
wish to make a contribution to care, or to ‘nudge’ somewhat reluctant children to 
consider helping out. For those ill individuals unlucky enough to have neither a 
healthy spouse nor healthy adult children within a reasonable distance, we would 
then need to consider supporting those individuals with state-funded formal care. 
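The cumulative curve in Fig. 10.5 can be computed from one number per ill agent: the distance to their nearest healthy spouse or child. The sketch below assumes that distance has already been calculated from the recorded locations (how distance is measured on the Doughnut is not specified here); the function name and the use of `None` for "no healthy carer anywhere" are illustrative choices.

```python
def cumulative_availability(nearest_carer_distance, thresholds):
    """Fraction of ill agents whose nearest healthy carer is within each threshold.

    nearest_carer_distance: one entry per ill agent; None means the agent has
    no healthy spouse or child at any distance.
    """
    n_ill = len(nearest_carer_distance)
    if n_ill == 0:
        return [0.0 for _ in thresholds]
    return [
        sum(1 for d in nearest_carer_distance if d is not None and d <= t) / n_ill
        for t in thresholds
    ]


if __name__ == "__main__":
    distances = [0, 0, 5, 12, None]   # made-up values for five ill agents
    print(cumulative_availability(distances, [0, 10, 50]))
```

A distance of 0 corresponds to a carer in the same household, so the first point of the curve is the share of ill agents with in-house care.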


10.5.5 Sensitivity Analysis Using Emulators 


One area of difficulty for social simulations, discussed previously in Part II, is the 
analysis of simulation results. When investigating multilevel simulations incorpo- 
rating complex interactions between agents and their environment, understanding 
the specific impact of our parameter settings can be a very difficult undertaking. As 
a consequence, researchers more accustomed to the well-established methodologies 
of statistical demography can find this aspect of agent-based modelling concerning; 
after all, if we build a fascinating model but are unable to determine what actually 
happened during a run, have we gained enough knowledge of the mechanisms at 
play to justify the time and effort involved? 

However, recent innovations in statistics have provided new ways of understand- 
ing the impact of model parameters on results, even in very complex simulations. 
Here we have followed the example set by the ‘Managing Uncertainty in Complex 
Models’ project, undertaken at the University of Sheffield until 2012, and used 
Gaussian process emulators to examine the impact of simulation parameters in the 
Wedding Doughnut. 

We will only summarise the approach briefly here, as a certain level of intensive 
statistical exposition is required to explain these emulators in detail; for a more 
detailed examination in the context of this specific model, please see Silverman et 
al. (2013). In essence, a Gaussian process emulator creates a statistical model of the 
computational model, known as the simulator, and then decomposes the variance 
of our main output variable of interest into a constant term, a series of main effects 
related to our input parameters, and interaction effects between those parameters 
(Oakley and O’Hagan 2002, 2004; Kennedy and O’Hagan 2001). Effectively we are
then left with an illustration of the amount of output variance that can be accounted 
for by each of our input parameters. 
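To make the idea of a main effect concrete, the sketch below estimates the share of output variance attributable to each input on its own, using brute-force sampling of a toy function. This is only an illustration: an emulator would typically derive these quantities from the fitted Gaussian process rather than from thousands of direct runs, and the toy function, sample size and binning estimator are all assumptions made for the example.

```python
import numpy as np


def main_effect_shares(x, y, n_bins=50):
    """Estimate the share of Var(y) explained by each input individually.

    For input i, the main-effect variance Var(E[y | x_i]) is approximated by
    the variance of the mean output within equal-count bins of x_i.
    """
    total_var = y.var()
    shares = []
    for i in range(x.shape[1]):
        order = np.argsort(x[:, i])
        bins = np.array_split(y[order], n_bins)
        bin_means = np.array([b.mean() for b in bins])
        shares.append(bin_means.var() / total_var)
    return np.array(shares)


rng = np.random.default_rng(0)
x = rng.uniform(size=(100_000, 2))
y = x[:, 0] + 2.0 * x[:, 1]      # toy stand-in for a simulator, no interaction
shares = main_effect_shares(x, y)
print(shares)                     # roughly one fifth and four fifths
```

Because the toy function has no interaction term, the two shares sum to approximately one; any shortfall from one in a real analysis is what the interaction effects account for.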

Kennedy then took this method further (Kennedy 2004), noting that simulations 
may contain additional variability due to uncertainty within the computer code itself. 
This can be accounted for through an additional term, the nugget, which addresses 
this additional uncertainty. 
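The role of the nugget can be seen in a bare-bones Gaussian process predictor. The sketch below is a generic textbook construction, not the GEM-SA implementation: the squared-exponential kernel, the zero prior mean, the fixed length scale and the function names are all assumptions. With no nugget the emulator passes through the training runs exactly; with a nugget it treats them as noisy and smooths across them.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_predict(x_train, y_train, x_new, nugget=0.0, length_scale=1.0):
    """Posterior mean of a zero-mean GP; the nugget is added to the diagonal."""
    k = rbf_kernel(x_train, x_train, length_scale)
    k += (nugget + 1e-10) * np.eye(len(x_train))   # tiny jitter for stability
    k_star = rbf_kernel(x_new, x_train, length_scale)
    return k_star @ np.linalg.solve(k, y_train)


if __name__ == "__main__":
    x_train = np.array([0.0, 1.0, 2.0, 3.0])
    y_train = np.sin(x_train)
    print(gp_predict(x_train, y_train, x_train))              # reproduces y_train
    print(gp_predict(x_train, y_train, x_train, nugget=0.5))  # smoothed values
```

For a stochastic simulation, where repeated runs at the same parameter values give different outputs, the smoothing behaviour is the appropriate one.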

In order to implement a Gaussian process emulator for our simulation results, we
identified four main input parameters: α and β, two parameters defining the social
pressure function as formulated in the original Ring model (Billari et al. 2007); c,
the scaling term in our transition probability function (set at 0.00041 and 0.00039 in
Eq. 10.1); and d, the spatial distance in which partner search is allowed. Our output
of interest is defined as the share of ill agents without access to a healthy spouse
or adult children. We then ran the simulation 400 times at a range of values for
these four key parameters, and recorded the final output values for each run. We
input those results into the free software GEM-SA (Gaussian Emulation Machine
for Statistical Analysis) by Kennedy and O’Hagan to run this initial emulator, and
the results of 41,000 emulator runs are shown in Fig. 10.6 in the form of a heat map.

Fig. 10.6 Share of ill agents with no available carers (Source: Figure 6, Silverman et al. 2013)

The emulator produced a mean of 55.4% of ill agents with no access to informal 
carers; note however that this mean is generated from results across the entire 
segment of the parameter space, and many areas of this space will have very 
unrealistic values of the four input parameters. The vast majority of the variance in 
the final output values originated from α and β, at 29.8% and 48.6% of the variance
respectively. 

In order to more clearly illustrate the link between partnership formation and 
informal care provision in the model, we implemented a second emulator, this time 
using the share of agents who partnered over the course of the simulation as our final 
output of interest. In the model an agent’s health status has no impact on partnership 
decision-making, so in this iteration we only used α, β, and d as input parameters.

Figure 10.7 illustrates the results, again in the form of a heat map. The values of
α and β again account for most of the output variance, 30.2% and 45.6%
respectively, and 17.7% of the remaining variance is due to the interaction of those 
two terms. 


Fig. 10.7 Share of agents who have ever been married; panel title: Proportion of Agents Over 16 Ever Partnered; axis label: ln(Beta) (Source: Figure 7, Silverman et al. 2013)


Taking these two visualisations together, we see a clear correspondence in the
parameter space: regions with high levels of partnership formation show lower
unmet care need, and regions with low partnership formation show higher unmet care
need. The emulator thus serves as a comprehensive analysis of a segment of the
simulation’s parameter space, allowing us to investigate an enormous range of 
possible scenarios with a relatively small number of simulation runs. In the context 
of more complex simulations, we can easily see how the Gaussian emulator method 
puts our input parameter values in context and allows us much greater insight into 
their impact on simulation outputs. 


10.6 The Wedding Doughnut: Discussion and Future 
Directions 


As mentioned above, these results are intended as a clear and simple illustration of 
how even relatively abstracted models informed by real-world population data can 
answer policy questions that traditional demographic methods struggle to address. 
Even in our simplistic case, we are able to discuss possible scenarios of population 
change and population health at both the micro and macro levels. Through detailed
study of simulation results, we are able to explore the possible consequences of
policy changes, and draw some conclusions about where potential policy solutions 
are most urgently needed. 

The model results here also indicate that spatiality plays an important role in 
the function of these social processes at the micro level. While there may be 
circumstances in which modern technology such as telemedicine systems can be 
of some help, in the case of social care we are studying individuals who need 
direct, hands-on help with simple activities of daily living, from changing clothes 
to going to the bathroom. As such, in cases like this where direct interaction is 
paramount, these simulations underline the importance of considering spatiality 
when making predictions regarding care demand and supply. Fortunately, such data 
is often available through the census or large-scale survey studies, but we would 
suggest that agent-based models are better able to directly represent the impact of 
spatiality than statistical, population-level models.

Of course there are clear areas where this simulation falls short of being realistic. 
As we outlined in previous chapters, this is not necessarily a negative in cases 
where we are investigating general social theories, but in the case of a policy- 
relevant area like social care, additional details implemented sensitively would 
enable us to investigate additional important social factors that can significantly 
impact the demand for and supply of social care. In this model we have only a 
very simplistic representation of ill health, for example, whereas in the real world, 
different individuals will have highly differentiated levels of care need, which will 
demand different amounts of investment from their families or the state. Partnerships 
also continue for life in this model, which doubtless inflates the numbers of agents 
having access to healthy carers; in reality partnership dissolution is commonplace, 
which could very well have a significant effect on the levels of unmet care need in 
the population. 

In the next chapter, we will discuss a simulation which leaves the Wedding Ring 
framework behind, and adds additional levels of detail for each agent’s life-course. 
In this model, partnerships can both form and dissolve, health status is not simply a 
binary state of ill-or-healthy but involves five increasing states of severity, and agents 
can migrate out of the family home for a much larger array of reasons than just 
finding a partner. We will also make use of more sophisticated methods of sensitivity 
analysis to investigate the model’s parameter space. 


10.7 General Conclusions 


In this chapter we have examined some practical examples of the application 
of agent-based social simulation methods to the discipline of demography. Our 
starting examples of the modelling of the Anasazi abandonment of Long House 
Valley (Axtell et al. 2002) and Billari’s examination of partnership formation in a 
simple one-dimensional space (Billari et al. 2007) illustrated some of the benefits 




of agent-based methods for the generation of new demographic knowledge. These 
two models demonstrate how model-based demography has the potential to answer 
questions that traditional statistical demography cannot, and how avoiding ‘the 
beast’ of demographic data demands can allow us to maintain tractability even in 
models of complex processes of population change. 

Our detailed examination of the extended Wedding Doughnut model of part- 
nership formation and health status takes additional steps toward integrating the 
richness of demographic data with the flexibility and power of simulation. This 
model has a fundamentally simple foundation, providing the essential processes 
necessary to generate a realistic illustration of an ageing population based on UK 
census data. By integrating statistical demographic data and predictions into this 
model, we are able to ensure the results remain relatively realistic; however, the 
amount of data incorporated into the model is small by the standards of multilevel 
approaches in demography, as we leave the partnership decision-making within the 
model to the simple social pressure function derived from research results in the 
social sciences. 

A further consequence of this approach to modelling is that we are able to 
investigate the actions of agents embedded in both physical and social spaces. 
This opens the door to better understanding of the impact of linked lives — the 
substantive connections between individuals that can have both social and physical 
consequences. The Wedding Doughnut model presents a simple example of how 
these social connections and agents’ spatial distribution interact in the context of 
social care, in which these two aspects are critical in driving the demand and supply 
of social care in an ageing society. 

As with any simulation of social systems, we are left with substantial archives 
of data after running the model many hundreds of times; in this particular case we 
generated several gigabytes of detailed logs of every agent action across numerous 
possible scenarios. However, by implementing a probabilistic sensitivity analysis 
using Gaussian process emulators, we are able to develop a more systematic 
understanding of the impact of each key input parameter on the final output variance. 
In turn, we are able to insulate ourselves somewhat from the analysability and 
tractability concerns so often cited by those skeptical of simulation approaches. 

Thus, we can begin to see how model-based demography can take shape, and 
take advantage of the empirical richness of demography and the investigation of 
complexity facilitated by agent-based modelling approaches. However, the model 
demonstrated here remains relatively simple, and while we were able to generate 
useful illustrative results from this effort, establishing model-based demography as 
a substantive paradigm within the discipline will require testing these principles and 
methods with more robust simulations. 

In the next chapter, we will present another simulation study which takes the 
simulation aspects to a higher level of detail, while still remaining highly tractable 
and analysable using the same tools outlined here. From there we will take stock of 
the lessons learned from these early explorations in model-based demography, and 
outline how future investigations within this paradigm might proceed.


Chapter 11
Model-Based Demography in Practice: II


Eric Silverman, Jason Noble, Jason Hilton, and Jakub Bijak 


11.1 Introduction 


The previous chapter provided some examples of the practice of model-based 
demography. We were able to develop a simulation which, despite significant 
simplifications, was able to illustrate some of the core concepts of model-based 
demography. The model captured the core demographic processes and their role 
in partnership formation, while remaining free of excessive and expensive data 
demands. Using Gaussian process emulators allowed us to investigate the impact 
of key model parameters, making the model more tractable and more useful as a 
potential policy-making tool. 

Now we will examine another simulation which builds upon these foundations. 
By increasing the complexity of the modelled agents and focusing on a specific 
policy question, we will illustrate how agent-based modelling combined with 
statistical demographic projections can create a useful platform for experimentation 
with social policy. Once again the use of Gaussian process emulators will provide 
further insight on our results, demonstrating the unexpected interactions that can 
occur in complex, interlinked social processes. 


11.2 Model Motivations 


This modelling project came about as part of the EPSRC-funded ‘Care Life Cycle’
project at the University of Southampton, which ran from 2010 to 2015. This project 
sought to develop innovative means for predicting the eventual supply and demand 
of social care in the United Kingdom, using methods drawn from complexity science 
(Brailsford et al. 2012). The project team consisted of a core group of academics 
drawn from a variety of disciplines, including demography, gerontology, operations 
research, and agent-based modelling.

As with many industrialised nations, the age structure of the population of the
United Kingdom has been shifting over recent decades. Birth rates are low and 
life spans continue to lengthen, resulting in an ever-increasing demand for social 
care services (Raphael Wittenberg et al. 2051). This trend is becoming increasingly 
severe as the supply of social care also begins to decrease due to the ageing of the 
care workforce (Coleman 2002). 

Projections of future care demand and supply have for the most part been 
performed using statistical methods, as in the citations above. However, the 
Care Life Cycle acknowledged that the provision of social care is inherently a 
complex process, involving factors present at multiple levels of society. Within 
families, informal social care decisions can involve aspects of partnership status, 
socioeconomic status, social networks, and more; similarly, formal care involves 
issues of the cost of care provision, the availability of carers and migrant workers, 
and complicated links between public and private providers. A full understanding of 
these processes and the dynamics of the interactions between them requires us to move
beyond statistical approaches alone and apply the methods of complex systems 
science (Silverman et al. 2012). 

In this model we address this critical area of public policy using an agent- 
based model combined with demographic projections. As we saw in the previous 
chapter, this kind of approach allows us to take advantage of the power of agent- 
based models and their ability to represent complex processes while keeping our 
results closely aligned to empirical data and real-world demographic processes. By 
generating possible scenarios of future social care demand and supply, and analysing 
the results using uncertainty quantification methods, we can better understand the 
effect of the social factors and processes underlying this problem and develop an 
effective platform for testing potential policy responses. 


11.3 The ‘Linked Lives’ Model 


This model is an agent-based, spatially-embedded platform in which simulated 
individuals live out their life-courses in a virtual space roughly modelled on UK 
geography (Silverman et al. 2013). The version presented here is an extension of 
a previous model, dubbed the ‘Linked Lives’ model, which demonstrated the core
components of this framework (Noble et al. 2012). 


11.3.1 Basic Model Characteristics 


The simulation was written from scratch in the programming language Python, 
chosen both for its ease-of-use even for inexperienced programmers and the 
availability of convenient libraries for data analysis and plotting (Noble et al. 2012). 
Python is an interpreted rather than a compiled language, meaning that simulation 
runs are not optimised for any specific CPU architecture and thus slower; however,
the average simulation run can still finish in approximately 1 to 1.5 min if real-time
visualisations are turned on, or 20 to 30 s if they are turned off.¹ The simulation code is
freely available and is provided under the MIT Licence, a highly permissive
open-source licence.²

Agents in our model occupy a rough analogue of UK geography divided into a 
grid which consists of virtual ‘towns’ each containing up to 625 households. The 
number of households allocated to each town is determined according to the local 
population density, which is set in the initial conditions of the model to reflect the 
approximate population density of the UK. Agents represent individuals and can 
have differing individual characteristics including age, sex, location, work status 
and health status; the health status aspect of the model will be detailed further below. 
The simulation operates at roughly a 1:10,000 scaling factor compared to the real 
UK population, meaning that an agent population representing the UK at the time 
of writing would consist of approximately 6,400 agents. This scaling was done in 
order to reduce the computational requirements of the simulation, as the activities 
of 60 million agents or more would take very large amounts of computer time to 
calculate. 

Agents live out their life-courses in one-year time steps, at the end of which their 
movements and changes in status are calculated and recorded in detailed log files at 
both the individual and population levels. In order to allow time for the simulated 
population to ‘settle’, we begin each run in the year 1860. A simulation run ends in
the year 2050, at which point we record the projected values for social care cost.

The agents are capable of forming and dissolving partnerships in the model, 
though note that this simulation does not use the social pressure model of part- 
nership formation outlined in the previous chapter (please consult Noble et al. 2012 
for the specifics of this aspect). For the purposes of this simulation we only model 
partnerships as relationships that can produce offspring. The marriage market for 
agents is nationwide, and agents will be paired up if they both meet one another’s 
partnership criteria and if neither already has a partner. Partnership dissolution is 
a simple process driven by age-specific probabilities of the male agent leaving the 
partnership. 


11.3.2 Health Status Component 


In contrast to the Wedding Doughnut, agents in the Linked Lives model are able 
to transition into varying levels of social care need. Every year agents are checked 


¹ These runs were all performed on a 2009-era personal desktop computer with a 2.8 GHz i7 quad-
core processor, 12 GB RAM and a 7200 RPM hard drive. We expect runs would be significantly
faster on a more modern machine with a solid-state drive and more up-to-date CPU architecture.

² The code can be accessed and downloaded here: https://github.com/thorsilver/ABM-for-social-care

Table 11.1 Care need categories

Care need category | Weekly hours of care required
None               | 0
Low                | 8
Moderate           | 16
Substantial        | 30
Critical           | 80

Table 11.2 Care provision for each agent category

Agent status         | Weekly hours of care available
Dependent children   | 5
Adult living at home | 30
Retired agents       | 60


against the age- and sex-specific probability that they may transition into a state 
of care need. Agents who are already in a state of need may worsen in future 
transitions, but we assume agents do not improve again once they have entered a 
state of long-term limiting illness. The different levels of care need and the hours of 
care required for each are given in Table 11.1.

Unlike the Wedding Doughnut’s simplistic model of health status, in Linked 
Lives the provision of informal social care is modelled explicitly. Each agent is 
capable of providing care according to their own health status and the amount of free 
time they have available, which depends on their stage in the life-course and their 
work status, as seen in Table 11.2. Note that agents who are ill themselves
can provide care for others, but only if they themselves require ‘Low’ levels of
care. In addition they can only provide half the hours of care normally provided by
someone at their stage of the life-course and work status. The simulation assumes
that agents will provide informal care, if they have time available, to any member of
their household that enters a state of care need; we do not model varied levels of
willingness to care, though this is a target for a future expansion of this simulation. 
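The supply rules just described can be sketched as a small household calculation using the hours in Tables 11.1 and 11.2. This is an illustration rather than the model's code: the function names, the lower-case status labels, the zero default for statuses not listed in Table 11.2, and the simple first-come allocation of carers' hours are all assumptions.

```python
# Weekly hours from Tables 11.1 and 11.2.
CARE_NEED_HOURS = {"none": 0, "low": 8, "moderate": 16, "substantial": 30, "critical": 80}
CARE_SUPPLY_HOURS = {"dependent child": 5, "adult at home": 30, "retired": 60}


def hours_available(status, need):
    """Weekly hours of informal care one agent can offer to others."""
    base = CARE_SUPPLY_HOURS.get(status, 0)   # assumption: unlisted statuses give 0
    if need == "none":
        return base
    if need == "low":
        return base / 2    # ill agents with 'Low' need provide half the usual hours
    return 0               # agents with higher care needs provide no care


def household_unmet_hours(members):
    """Unmet weekly care hours in one household.

    members: list of (status, need) pairs. Carers' hours are handed out in
    list order, and nobody can supply care for their own need.
    """
    remaining = [hours_available(status, need) for status, need in members]
    total_unmet = 0
    for i, (_, need) in enumerate(members):
        required = CARE_NEED_HOURS[need]
        for j in range(len(members)):
            if j == i or required == 0:
                continue
            given = min(required, remaining[j])
            remaining[j] -= given
            required -= given
        total_unmet += required
    return total_unmet
```

For example, a household of one healthy retired agent and one retired agent with critical need leaves 20 of the 80 required weekly hours unmet.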

The social care aspect of the model is further simplified by avoiding an explicit 
representation of formal care, such as care homes or inpatient treatment. The model 
adds together any unmet care needs of individual agents and ‘charges’ these to the 
state at a rate of £20/h to arrive at the final figure of social care cost in each time 
step. The intent of the model is not to predict social care costs accurately, but instead 
to demonstrate the relative impact of different policy solutions on that figure, so 
in that sense the realism of that figure was not important to the final analyses to 
be performed in this version (and obtaining the data for these figures would be a 
complex undertaking all of its own). Note also that the simple economic model 
present here does not include inflation, so the final costs would actually be somewhat 
higher than what is presented here. 
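The cost calculation itself is simple arithmetic and can be sketched as below. The £20/h rate is the figure given in the text; converting weekly hours to an annual figure by multiplying by 52 is an assumption made for this example, as are the function names.

```python
HOURLY_RATE_GBP = 20    # rate stated in the text
WEEKS_PER_YEAR = 52     # assumption: weekly unmet hours are annualised this way


def annual_care_cost(unmet_weekly_hours_by_agent):
    """Total state cost for one yearly time step, in pounds."""
    return sum(unmet_weekly_hours_by_agent) * WEEKS_PER_YEAR * HOURLY_RATE_GBP


def per_capita_cost(unmet_weekly_hours_by_agent, population_size):
    """Annual cost divided across the whole simulated population."""
    return annual_care_cost(unmet_weekly_hours_by_agent) / population_size


if __name__ == "__main__":
    unmet = [20, 8, 0]                     # made-up unmet hours for three agents
    print(annual_care_cost(unmet))         # 28 h/week * 52 weeks * 20 GBP/h
    print(per_capita_cost(unmet, 100))
```

Because the rate is a fixed multiplier, changing it rescales every scenario equally and leaves the relative comparison between policy options unchanged.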


11.3.3 Agent Behaviours and Demographic Elements


As agents age during the course of a simulation run, they can undergo several 
different life-course transitions, each of which is recorded and can affect other 
aspects of the simulation. When agents are born they are classed as dependent 
children, and are thus able to provide only a small amount of care, as outlined above. 
Once agents reach the age of 17 they are considered adults and are able to enter 
the workforce and begin paying tax into the system. During adulthood agents may 
choose to live at home with their parents or move out on their own. At the default 
settings, agents cease working and become retired at age 65, but as we shall see 
later, changing this parameter has some interesting effects. 

The first Linked Lives model used a Gompertz-Makeham model of mortality, and 
fertility rates were a flat probability of reproduction for any partnered female agents 
of childbearing age (Noble et al. 2012). In this revision we linked these aspects with 
empirical data by making use of demographic data starting in the year 1951 (the date
of the first post-war UK census) (Silverman et al. 2013). As with the Wedding Doughnut, we
used age-specific mortality rates from the Human Mortality Database (2011), and
fertility rates from the Office for National Statistics (1998) and Eurostat (2011).³

We then used the techniques pioneered by Lee and Carter (1992) to calculate 
projections for these rates until the end of the simulation in 2050. In the case of 
mortality, these projections show continual increases in lifespan throughout the 
simulation period, though the rate of increase slows over time. In the case of fertility, 
we see the rates converge at just above replacement fertility, and a continuing trend
toward later childbearing.⁴
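For readers unfamiliar with the Lee-Carter approach, the sketch below shows its standard one-component form: log rates are modelled as an age profile plus an age-specific response to a single time index, fitted by singular value decomposition of the centred log rates. It is not the code used for the projections in this chapter; the function names are hypothetical, the random walk with drift is the usual textbook forecasting step and is assumed here, and the two-component variant used for fertility is not shown.

```python
import numpy as np


def fit_lee_carter(log_rates):
    """Fit log m[x, t] ~ a[x] + b[x] * k[t] from an (ages x years) array."""
    a = log_rates.mean(axis=1)                   # average age profile
    centred = log_rates - a[:, None]
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    b = u[:, 0]                                  # age response (first component)
    k = s[0] * vt[0]                             # time index
    scale = b.sum()                              # usual constraint: sum(b) = 1
    return a, b / scale, k * scale


def project_lee_carter(a, b, k, n_years):
    """Extend k[t] as a random walk with drift and rebuild the rates."""
    drift = (k[-1] - k[0]) / (len(k) - 1)
    k_future = k[-1] + drift * np.arange(1, n_years + 1)
    return np.exp(a[:, None] + b[:, None] * k_future[None, :])
```

Applied to observed rates for 1951 to 2009, `project_lee_carter` would be called with enough years to reach the end of the simulation in 2050.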

The Linked Lives model represents migration in greater detail than the Wedding 
Doughnut. Agents are able to migrate under several different conditions. When 
agents form a partnership, they may choose to move into their partner’s home, which 
may still contain members of that partner’s family. They may also choose to move 
elsewhere and start a new household, which can be in the same town as one partner 
or an adjacent one. Agents may also move independently without a partner; once an 
agent reaches adulthood there is an age-specific probability that they may make this 
move in a given year. 

Some agents may also choose to move arbitrarily, rather than in response to a 
change of status; this is intended to be reflective of other individual life choices, 
such as changing careers or simply desiring a change. Agents who opt to dissolve a 
partnership will in turn dissolve their household, with the male agent moving to 
a new random household, while the mother retains custody of any child agents 
resulting from their partnership. On those rare occasions when both parents die 


³ The mortality rates from HMD run from 1951 to 2009, while the ONS fertility rates are used from
1951 to 1972 and Eurostat rates from 1972 to 2009.

⁴ Mortality rate projections used a singular value decomposition matrix of centred mortality rates.
Fertility rate projections used two components of the singular value decomposition matrix in order
to best capture fertility trends.

before their children reach working age, the children are adopted by a random
household and move to join them. 

Upon reaching retirement, there is a small chance that agents may choose to 
move in with one of their children. This was intended to represent the choice of 
some retirees to move in with their children when they start suffering more health 
problems. This aspect is investigated further in the Results section. 


11.4 Results 


Given the Linked Lives model’s focus on providing a platform for investigating pol- 
icy, analysis of simulation results focused on exploring the effects of key parameters. 
Several parameters were designed to replicate proposed policy solutions, including 
increasing the retirement age to increase tax receipts, and encouraging aged parents 
to move back in with their adult children to receive care. Prior to a more in-depth 
sensitivity analysis, a series of parameter sweeps were run to investigate the impact 
of these parameters on the cost of social care per capita in simulation year 2050. 


11.4.1 Parameter Sweeps 


Each of the graphs in this section uses an identical format, with the final per 
capita social care cost as the vertical axis, and relevant parameter values along the 
horizontal axis. The displayed values for each parameter setting are the mean final 
output values for ten runs at that setting. 
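
The averaging procedure described above can be sketched roughly as follows. This is only an illustration: `run_model` is a hypothetical stand-in for a single simulation run, and its toy response and noise level are assumptions, not the behaviour of the Linked Lives model.

```python
import random
import statistics

def run_model(retirement_age, seed):
    """Hypothetical stand-in for one simulation run; returns a final per capita cost."""
    rng = random.Random(seed)
    # Toy response (an assumption): cost falls with retirement age, plus run-to-run noise.
    return 5000.0 - 40.0 * (retirement_age - 55) + rng.gauss(0.0, 50.0)

def sweep(values, runs_per_setting=10):
    """Return {parameter value: mean final output over repeated runs}."""
    results = {}
    for value in values:
        # The same seeds are reused at every setting, so settings share their noise draws.
        outputs = [run_model(value, seed) for seed in range(runs_per_setting)]
        results[value] = statistics.mean(outputs)
    return results

means = sweep([55, 60, 65, 70, 75])
for value, mean_cost in means.items():
    print(f"{value}: {mean_cost:.1f}")
```

Each plotted point in a sweep of this kind is one entry of `means`: a single parameter setting paired with the mean of its ten final outputs.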

Figure 11.1 focuses on a parameter which defines the probability in any given 
year that an older parent may choose to move back in with their adult children. 
When building the simulation, we had theorised that encouraging older parents to 
move in with their children would reduce care costs by making it more likely those 
parents would have access to informal care. As we can see in the graph, however, 
even when the probability was eight times higher the social care costs were nearly 
identical. Ultimately it seems that the share of older agents taking this option was 
not large enough to produce a noticeable reduction in cost. 

In Fig. 11.2 we see the impact of changing the hours of care that retired agents 
are able to provide for loved ones. Here there is a significant impact on cost as the 
availability of care increases; retired agents are the most likely to live with another 
agent, generally a spouse, who is also older and more likely to need care. Thus 
when these agents are able to give more, a relatively large amount of care need is 
then taken on by this group rather than passed on to the state. 

Next we have an unsurprising result in Fig. 11.3, which shows the final social care 
costs resulting from increasing probabilities of agents transitioning into a state of 
care need. As this parameter essentially decides the overall health of the agent pop- 
ulation, increasing the likelihood of care transitions has a dramatic impact on cost. 

Fig. 11.1 Results of five ten-run series of the model for different values of the parameter controlling the likelihood of parents moving ... 

Finally, Fig. 11.4 shows the impact of increasing the agent retirement age. Costs 
increase significantly when the retirement age is reduced, which fits results seen in 
the real world. However, while increasing the retirement age does reduce cost to 
a point, the effect levels off beyond age 70. This suggests that keeping agents in 
the workforce longer does help to some degree, but it also reduces the availability 
of informal care for older people significantly, as agents who become sick may no 
longer be able to fulfil their care needs by living with a retired spouse with time to 
spare. 

Interestingly, this result is consistent despite the simulation lacking any means 
for representing the possible health impact of working later in life. In that respect 
the model may actually be presenting an optimistic picture in this case. A future 
version of the model will examine this aspect more closely and incorporate more 
detail in this critical part of the life course. 

Fig. 11.4 Results of five ... 

Fig. 11.5 Filled contour plot displaying model outputs for ranges of values for base care 
probability and retirement age. High probability of care need transitions combined with low 
retirement ages produces extremely high social care costs 

The contour plot in Fig. 11.5 provides a synthesis of sorts, illustrating the impact 
of retirement age and the base probability of care need transitions on the final social 
care cost. Scenarios featuring a low retirement age and generally poor population 
health produce extremely high per capita care costs, reaching well over £20,000 per 
year. On the other end of the scale, the lowest costs appear in scenarios with high 
population health and high retirement age; in these scenarios we have a larger 
proportion of healthy agents able to provide care, as well as lower overall numbers 
of agents needing care in the first place. 

Figure 11.5 serves as a useful illustration of how the investigation of a model’s 
parameter space can help us to explore possible population scenarios. By running 
the model several hundred times across a range of values for these two parameters, 
we are able to see clearly how these two values interact. We would not want to use 
these scenarios as firm predictions, but when dealing with a problem as complex as 
social care supply and demand these scenario explorations help us to understand the 
interactions between the policy instruments we have available. 

The parameter sweeps above show that even in a simplified model the processes 
underlying social care supply and demand can interact in unexpected ways. The 
retirement age investigation in particular illustrates that the model actually produced 
a surprising insight: that significant savings are generated for the public 
purse through the care provided by elderly relatives. These results have since been 
confirmed by a Carers UK and Age UK report and subsequent follow-up research 
into what they call the ‘invisible army’ of older carers who save the UK an estimated 
£5.9 billion a year on social care costs (Age 2016, 2017). The results from this model 
pre-date that report by three years, which suggests that well-constructed agent-based 
models can help us to identify and anticipate significant trends in population health 
even before relevant data collection has occurred. 


11.4.2 Sensitivity Analysis with Emulators 


The complex, interacting processes at play in the social care system are represented 
in a relatively simplistic way in this simulation, but as we have seen, the results 
still demonstrate some interesting dynamics. In order to better understand the 
relationships between these four key model parameters, a Gaussian process emulator 
(O’Hagan 2006) was used in a similar way to the Wedding Doughnut model. 

In this case, the final output of interest is the final social care cost per capita 
in simulation year 2050. The four input parameters discussed above were used in 
the emulator, and a range of parameter values were chosen to provide a reasonable 
section of the parameter space to investigate. Once again we followed the example 
of Kennedy (2004) and used a ‘nugget’ term to account for the additional 
uncertainty generated by the program code itself. 
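
To make the role of the nugget concrete, the following is a minimal sketch of Gaussian process regression in which a nugget is added to the diagonal of the training covariance. It is not the GEM-SA implementation: the zero prior mean, the squared-exponential kernel, the fixed hyperparameters and the toy data are all assumptions made for illustration.

```python
import numpy as np

def sq_exp_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between the rows of a and the rows of b."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return variance * np.exp(-0.5 * sq_dist / length_scale ** 2)

def gp_predict(x_train, y_train, x_new, nugget=0.01):
    """Posterior mean and variance of a zero-mean GP with a nugget on the training covariance."""
    # The nugget inflates the diagonal, so repeated runs at one input need not agree exactly.
    k_train = sq_exp_kernel(x_train, x_train) + nugget * np.eye(len(x_train))
    k_cross = sq_exp_kernel(x_new, x_train)
    chol = np.linalg.cholesky(k_train)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_cross @ alpha
    v = np.linalg.solve(chol, k_cross.T)
    # Variance of the underlying smooth function (the nugget itself is not added back).
    var = sq_exp_kernel(x_new, x_new).diagonal() - np.sum(v ** 2, axis=0)
    return mean, var

rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 5.0, 20).reshape(-1, 1)
y_train = np.sin(x_train[:, 0]) + rng.normal(0.0, 0.1, size=20)  # noisy toy outputs
x_new = np.array([[2.5], [10.0]])  # one point inside the design, one far outside it
mean, var = gp_predict(x_train, y_train, x_new, nugget=0.01)
print(mean, var)
```

The predictive variance is small inside the range of the training runs and returns towards the prior variance far from them, which is the behaviour that makes an emulator useful for judging where further simulation runs are needed.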

In order to provide a substantial dataset for the emulator, multiple runs of the 
simulation were performed at all possible combinations of the chosen parameter 
values. This resulted in approximately 1,300 simulation runs, the outputs of which 
were condensed into a pair of text files and fed into the GEM-SA (Gaussian 
Emulation Machine for Sensitivity Analysis) software (Kennedy 2004). 
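
A full factorial design of this kind can be generated along the following lines. The parameter names, levels and number of repeats below are hypothetical, since the values actually used are not listed here, and the whitespace-delimited layout is an assumption rather than a documented GEM-SA file format.

```python
import itertools

# Hypothetical levels for the four parameters (illustrative values only).
levels = {
    "parents_moving_in": [0.01, 0.02, 0.04, 0.08],
    "retired_hours": [20, 40, 60],
    "base_care_prob": [0.0002, 0.0004, 0.0008],
    "retirement_age": [55, 60, 65, 70, 75],
}
repeats = 2  # hypothetical number of runs per combination

def full_factorial(levels, repeats):
    """Yield one input row per run: every combination of levels, repeated."""
    for combo in itertools.product(*levels.values()):
        for _ in range(repeats):
            yield combo

design = list(full_factorial(levels, repeats))

# One row of inputs per run; the matching outputs would be written to a second file.
inputs_text = "\n".join(" ".join(str(v) for v in row) for row in design)
print(len(design))
```

The number of runs grows multiplicatively with the number of levels per parameter, which is why even a modest grid over four parameters quickly reaches hundreds or thousands of simulation runs.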

GEM-SA produces two primary forms of output, displayed here in Figs. 11.6 
and 11.7. In Fig. 11.6 we have a summary of the percentage of the final output 
variance accounted for by the parameter named in the left-hand column; the top 
four rows show the variance resulting from the four main parameters individually, 
while the six rows beneath record the impact of the interactions between pairs of 
those parameters. The ‘Base Care Prob’, or base probability of transitioning to a 
state of care need, clearly has the most substantial impact by far, accounting for 
88.9% of output variance; a generally healthy population produces generally lower 
costs, and vice versa. By contrast, the probability of parents moving in with their 
children (listed as ‘Parents Moving In’) has a very small impact on care costs, accounting for only 
0.01% of output variance alone and a further 0.02% in interaction with other factors. 

Fig. 11.6 Visualisation of sensitivity analysis performed ... (table columns: Input name, Variance (%)) 

Fig. 11.7 Results of Gaussian Process Emulator demonstrating the impact of each parameter on 
final output values. The output value is the mean cost of social care per taxpayer per year at the 
end of the simulation in year 2050 (Source: GEM-SA software (own calculations)) 

However, once again retirement age plays a larger role than anticipated. Some 
8.06% of the output variance is accounted for by the effect of the retirement age 
parameter, with a further 2% in interaction effects. This is significantly higher than 
the impact of the number of hours of care provided by retired agents (‘Retired 
Hours’), at 0.88% and 0.17% respectively. Figure 11.7 illustrates the impact of the 
four parameters on final output variance, and once again this visualisation confirms 
the surprising impact of retirement age on per capita social care cost. 
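
The idea of apportioning output variance between individual parameters and their interactions can be illustrated with a small calculation on a balanced grid. This is a sketch only: `toy_model` and its coefficients are hypothetical, and GEM-SA derives its percentages from the fitted emulator rather than from a direct grid calculation like this one.

```python
import itertools
import statistics

def toy_model(x1, x2):
    """Hypothetical output dominated by x1, with a small x2 effect and an interaction."""
    return 10.0 * x1 + 1.0 * x2 + 0.5 * x1 * x2

grid = [0.0, 0.25, 0.5, 0.75, 1.0]
runs = [((x1, x2), toy_model(x1, x2)) for x1, x2 in itertools.product(grid, grid)]
total_var = statistics.pvariance([y for _, y in runs])

def main_effect_share(index):
    """Share of output variance explained by the input at `index` on its own."""
    # Average the output over the other input at each level, then take the variance.
    cond_means = [
        statistics.mean(y for x, y in runs if x[index] == level) for level in grid
    ]
    return statistics.pvariance(cond_means) / total_var

share_x1 = main_effect_share(0)
share_x2 = main_effect_share(1)
interaction_share = 1.0 - share_x1 - share_x2  # remainder is due to the interaction
print(f"x1: {share_x1:.2%}, x2: {share_x2:.2%}, interaction: {interaction_share:.2%}")
```

On a balanced grid the main-effect shares and the interaction share sum to one, which mirrors the way the individual and pairwise percentages reported by the sensitivity analysis account for the total output variance.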


11.5 The Power of Scenario-Based Approaches 


The results of this simulation demonstrate the power of agent-based modelling when 
applied to a policy-relevant research question. By augmenting an abstracted spatial 
model of social care (Noble et al. 2012) with demographic data, we are able to 
investigate the impact of possible policy changes on social care costs. While the 
model lacks significant details regarding some of the specific processes and social 
factors underlying the social care system in the United Kingdom, the dynamics were 
captured well enough to produce results that highlighted the role of older carers in 
the system, an aspect of the problem that has only begun receiving attention very 
recently (Carers and Age 2015; Age 2016). 

By modelling both the supply and demand of care in the system, the model also 
allows us to study the impact of policy changes and anticipate possible unintended 
consequences of these changes. In this case, we can see that while shifting the 
retirement age upward does prompt significant reductions in care costs, this is not a 
bottomless well of additional tax revenue. Once we raise the retirement age beyond a 
critical point, the savings evaporate as older carers are then less available to provide 
informal care for their loved ones, the cost of which is then passed on to the state. 
Future iterations of the model could produce more specific recommendations in 
this respect by explicitly modelling the health impact of both longer careers and 
significant caring responsibilities amongst older people. As noted by Age UK, older 
carers are more likely than non-carers to report their health as ‘not good’, and 
are more likely to report being anxious or depressed (Carers and Age 2015), so 
modelling the pressures on this critical group can help us to better target support 
systems designed to help carers cope. 

Speaking more broadly, this model takes us another step down the road toward 
a model-based demography. By enhancing a relatively simple agent-based model 
with detailed mortality and fertility projections derived from real data, we are able 
to construct a simulation that integrates the strengths of statistical demography with 
the flexibility of simulation, all without relying on excessive amounts of empirical 
data. Further refinements to the approach will enable us to connect our models more 
closely to empirical data, enhancing the predictive capacity of the models, while also 
maintaining the critically important balance between realism, generality, precision, 
and tractability. 

The model can certainly be criticised for not providing point predictions of 
specific cost estimates, or fine levels of detail for different parts of the UK. However, 
the theoretical backstory of the model is designed to justify its use as an exploration 
of possible scenarios related to social care cost in a simulated ageing population. 
The lack of predictive capacity in this case is a conscious design decision, as we 
are seeking an understanding of the relative impact of certain factors on care costs 
and an exploration of possible policy impacts, not an economic forecast based on 
highly-detailed datasets. So while we may not be able to offer these results to the 
Department of Health as unambiguous recommendations of specific actions to take, 
even this simple model can produce some useful insights regarding the limited 
benefits of increasing retirement age in response to the pressures of an ageing 
population with growing social care needs. 

Future expansions of this model could focus on aspects such as international 
migration, a particularly important element in this context as demographic research 
has shown that policies aimed at replacement migration and increasing birth rates 
can help reduce the impact of ageing trends (Bijak et al. 2008). Additional details 
on the pressures facing older carers, including the impact of working in later life, 
would help to generate a more detailed portrait of that critical part of the informal 
care structures evident in UK society. Initial versions of the model were intended 
to include some effects derived from socioeconomic class, which is well known to 
influence health outcomes for older people (Majer et al. 2011), but this was removed 
for the sake of simplicity; future versions could include this aspect to allow the 
model to reflect the complex impact of health inequalities in the UK. 

In summary, the model provides an early-stage example of the value of scenario 
exploration in the context of policy-relevant demographic models. The use of 
Gaussian process emulators further enhances the model’s power by allowing us to 
pick apart our results and identify the relative impact of key parameters on the final 
output of interest. This kind of scenario-based approach allows us to look beyond 
the one-generation time horizon and anticipate the possible long-term impacts of 
population and policy changes. Future versions of this model will continue to push 
this approach forward by modelling more relevant social factors. 

In the next chapter we will take stock of our progress in model-based demography 
thus far, and discuss some key challenges that face the field in the future. 


Chapter 12 
Conclusions 


12.1 Model-Based Demography: The Story so Far 


As we have seen from the examples presented in Part III thus far, the development 
of model-based demography is advancing, though still in a relatively early stage. 
Demography is a field with a lengthy and successful history, and by the nature of 
the research questions it poses, has always been linked very tightly with concepts 
of statistics and probability (Courgeau 2012). Far from preventing progress in the 
field, this grounding has made demography influential since its earliest beginnings, proving vital 
for the study of population change and the implementation of critical social policies 
relating to its core processes of fertility, mortality and migration. 

Perhaps as a consequence of this, the introduction of a new methodology into 
a field with such an extensive history is one that bears careful consideration. After 
all, if statistical modelling of population data can still produce figures that journal 
editors, policy-makers, and research funders find useful and are happy to support, is 
there any particular need for new methods? 

As outlined in Chap. 9, demography has wrestled with this question before, and in 
each instance has incorporated these new methods and used them to enhance its core 
strengths. Each of the four methodological paradigms we outlined — period, cohort, 
event-history and multilevel — developed in response to shortcomings in previous 
methods that left certain demographic questions out of reach. New methods that 
addressed these questions thus became part of the demographic toolbox, augmenting 
the field’s capabilities but not replacing the methods that came before. 

In that respect, model-based demography answers a similar call. The epis- 
temological challenges posed by the problems of uncertainty, aggregation and 
complexity lend themselves toward a model-based approach. The application of 
simulation to demographic problems is not just in pursuit of novelty, but is a means 
to an end, offering the power to investigate and better understand the complex, 
multilevel interactions that drive population change. As with cohort, event-history, 
and multilevel approaches previously, model-based demography will become a key 
tool in the demographic toolbox for certain categories of research questions, and the 
four previous paradigms will continue to be useful for other questions. Indeed, as 
we have seen in the examples in Chaps. 10 and 11, simulation models and statistical 
demography can work in combination to answer questions about complex social 
processes. 


12.2 The Practice of Model-Based Demography 


The modelling studies presented in Part III provide detailed examples of the 
application of simulation modelling methodologies in demographic research. We 
should note that the studies presented here are far from a comprehensive survey 
of the whole of model-based demography, though we have tried to present some 
of the key influences on these studies and acknowledge their critical contribution 
to the state of the field today. Without the efforts of Axtell et al., Billari et al., 
and numerous others, the development of model-based demography would not have 
proceeded so quickly and effectively (Axtell et al. 2002; Billari and Prskawetz 2003; 
Billari et al. 2007). 

The Wedding Ring example in particular serves as a useful illustration of 
how simple models can add to demographic knowledge. Much like Schelling’s 
residential segregation model (Schelling 1971), the Wedding Ring focuses on a 
single complex question and seeks to model a possible answer with very little in 
the way of data or complexity. In both cases, we are not able to offer specific 
point predictions about the future state of residential segregation or coming trends 
in partnership formation, but we are able to offer some new conclusions about 
the social functions underlying those phenomena. With Schelling’s model, we can 
suggest that individual racial preferences may play an unexpectedly large role in 
the manifestation of residential segregation, and with the Wedding Ring we can 
propose that social pressure on the unmarried from the married can produce similar 
partnership formation patterns to those seen in the real world. Most fascinatingly, 
neither of these models required the incorporation of even a scrap of real-world data. 

These modelling approaches thus hew closer to systems sociology as opposed 
to social simulation (Silverman and Bryden 2007, to appear). The outcomes of 
these models help us to understand the social functions underlying a population- 
level outcome seen in human society, but in neither case are we able to state 
anything specific regarding a real-world example. This is a marked difference from 
traditional demography, in which the overwhelming majority of research is applied 
in nature. 

The Wedding Doughnut (Bijak et al. 2013; Silverman et al. 2013a) acknowledges 
this disconnect and attempts to resolve it, primarily through the integration of statis- 
tical demography. The model is extended to increase the influence of spatiality, and 
fertility and mortality, augmented by statistical demographic projections using real- 
world data, become key elements of the model’s behaviour. The model remains a 
proof-of-concept more than a focused policy tool, but these extensions demonstrate 
the capacity for simulation and traditional demography to work in tandem, and in 
the process harness the strengths, and mitigate the weaknesses, of each approach. 

The addition of uncertainty quantification in the form of Gaussian process 
emulators further eases the transition from statistical model to computational. The 
emulator provides key insight into the impact of simulation parameters, helping the 
modeller to redress the balance between precision and tractability. In practical terms, 
we can better understand the model’s behaviour and provide more incisive analysis 
of the results. In pragmatic terms, modellers accustomed to formal mathematical 
models may feel more comfortable working with computational models that are 
less of a ‘black box’. 


12.3 Limitations of the Model-Based Approach 


While the development of model-based demography provides new means for 
generating demographic knowledge, practitioners should remain mindful of the 
limitations of simulation as set out in Parts I and II. As with any modelling 
methodology, agent-based methods are best used to answer research questions that 
would explicitly benefit from modelling individual behaviours and interactions. 
When modelling problems that are closely related to complex social factors, agent- 
based models may provide a more suitable platform for explanatory aims. 


12.3.1 Demographic Social Simulation 


Revisiting the agent-based model of the Anasazi (Axtell et al. 2002) provides an 
example. The model was built using archaeological data, and provides a useful 
platform for exploring how the Anasazi population eventually declined. However, 
as pointed out by Janssen et al. (2009), one could argue that the model did not result 
in a substantial change in the discourse around the Anasazi’s decline. The authors’ 
replication shows that, while the model results do closely mirror archaeological data 
and provide sensible conclusions, the model does not provide much information 
beyond a comparatively simple model based on the carrying capacity of the Long 
House Valley itself: 


Within model-based archaeology two approaches can be identified: (1) detailed data-driven 
simulation models that mimic observed trends like population numbers, (2) stylized models 
that are informed by empirical data but explore a broader domain of possible social- 
ecological systems. Which approach is the most appropriate depends on the research 
question at hand. The fact that the Long House Valley abandonment can not be explained 
by environmental factors is demonstrated by the original Artificial Anasazi, but it could 
also be explained by calculating the carrying capacity of the valley. A more comprehensive 
question like whether exchange networks increase the resilience of settlements in the US 
south west may need to be addressed by a series of models, including stylized models that 
simulate various possible landscapes. (Janssen 2009, para. 5.4) 


In model-based archaeology, as in social sciences more broadly, the data- 
driven social simulation approach and the abstract, theory-driven systems sociology 
approaches are in evidence. In the case of the Anasazi simulation, one might argue 
that the model may have been more powerful than was actually demanded by the 
research question. We might propose as well that the model in this case served a 
useful role by confirming a result through a more detailed modelling approach, as 
well as providing a useful example of model-based demography in practice. 

The choice of whether to apply an agent-based model to a particular question is 
not always going to be straightforward, and there is often the chance that a model 
may be overkill for certain types of research questions. However, as in the case 
of the Long House Valley model, the model can serve additional purposes beyond 
testing a hypothesis, by providing a test case for a particular approach, generating 
new questions that can be examined with a refinement of the model, or by directing 
future data collection. 


12.3.2 Demographic Systems Sociology 


The Wedding Ring and Wedding Doughnut provide a useful example of the more 
abstract side of agent-based demography. These models are more generalised and 
abstract in their approach, in contrast to the Anasazi model. The Wedding Ring 
eschews empirical data entirely, building a model focused on testing a particular 
theory about the influence of social pressure on partnership formation timing (Billari 
et al. 2007). This is more in line with a systems sociology approach, 
in which we are examining the impact of social factors on a population-level 
phenomenon without reliance on empirical data. The theoretical focus of the model 
is acknowledged from the outset, and as a consequence the model provides both 
a useful exploration of theory and an influential proof-of-concept for demographic 
models with similar research aims. 

The Wedding Doughnut (Bijak et al. 2013; Silverman et al. 2013a) builds on the 
foundations of the Wedding Ring and takes them in a more empirical direction. 
Empirical data is used to generate the initial population and to drive the patterns 
of mortality and fertility amongst the agents. A simple model of health status is 
added to illustrate how simple models can still be relevant for the study of social 
policy. The model does not fully make the leap into social simulation, however, as 
the authors are not aiming for specific point predictions regarding social care need 
or UK population change; instead, the model provides an example of the integration 
of statistical demography and agent-based approaches. 

In an empirically-driven discipline like demography, models like these stand out 
as a more theoretically-driven approach. This can easily lead to misunderstandings, 
as the results may seem to lack relevance to real-world population change, and to be 
too ill-informed by population data to provide demographic insight. However, as noted 
by Courgeau and Franck (2007), a demography which exists to operate only on 
successive sets of data using identical methods is not a field which is progressing 
as a scientific practice. Model-based demography can provide a means to expand 
the theoretical innovation of demography, as illustrated by the Wedding Ring and 
subsequent extensions. 

In practice, the makers of such models should be mindful of their theoretical 
backstories, and ensure that the assumptions underlying their construction are 
clearly delineated from the start. Setting out the aims and purpose of a more abstract 
model will ensure that the results are properly placed into context by readers, and 
alleviate potential misunderstandings due to misapprehension of the model’s scope 
and intended impact. This will also help to ensure that comparisons between models 
and demographic approaches will be made on a like-for-like basis. Demographic 
systems sociology models will not evaluate well when compared against statistical 
models of a particular population, for example, given that the simulation in that case 
is not aiming for empirical relevance in the first place; but if the intended scope 
of the model is not laid out from the outset, that may not be clear and could lead to 
a negative evaluation of the methodology by the community. 


12.4 The Future of Model-Based Demography 


Model-based demography clearly has potential as an approach to certain types of 
demographic problems, both focused empirical questions and broader, theoretical 
concerns. However, the unique characteristics of agent-based approaches in particular 
suggest some avenues where this approach would be most fruitful. 

The demographic extension of the Linked Lives model discussed in Chap. 11 
(Noble et al. 2012; Silverman et al. 2013b) demonstrates the potential of agent-based 
demographic models for the study of major social policy concerns. Demography 
has a long history of empirical relevance, and is frequently used by policy-makers 
to guide their decisions (Xie 2000). The Linked Lives model aims to leverage this 
strength by combining statistical demographic elements with a detailed model of 
the supply and demand of social care in an ageing UK society, a problem receiving 
a great deal of focus in the UK political context at present. 

The simulation is built around a simplified version of UK geography, in which 
individual agents live, form partnerships, migrate, and provide care for loved ones. 
The original Linked Lives model (Noble et al. 2012) focused on the implementation 
and demonstration of the model as a useful platform for the examination of 
the cost of social care; the subsequent demographic extension (Silverman et al. 
2013b) incorporated UK census data and demographic projections of mortality and 
fertility to enhance the realism of the model. This combination produces population 
dynamics that accurately reflect demographic projections of the UK population. 

More importantly, however, the extended model demonstrates how a model of 
this type can provide unique insights into policy-relevant problems that benefit both 
from demographic expertise and the modelling of complex interactions facilitated 
by an agent-based approach. For example, the model illustrates that a policy change 
that might at first blush seem largely irrelevant to the cost of social care (in 
this case, increasing the retirement age) has a significant impact. The presence 
of retired carers actually accounts for a surprisingly large amount of the informal 
social care being provided in the model, and as a consequence, keeping older 
potential carers in work longer can backfire when the retirement age is raised too 
high (Silverman et al. 2013b). This result anticipates the later analysis by Age UK, 
which highlights the significant cost savings to society provided by these selfless 
older citizens (Age UK 2016). The use of Gaussian process emulators to confirm 
the impact of the retirement age parameter further increases our confidence in this 
result, allowing us to peer deeper into the workings of the model and determine key 
parameters that may be of particular importance to policy-makers concerned with 
social care. 

While the model remains more of a proof-of-concept, and does not claim to 
provide specific and solid predictions for the future of UK social care, it does 
provide a useful exemplar for future excursions into policy-relevant model-based 
demography. Despite the relative lack of data compared to data-rich microsimula- 
tions, the model is able to provide significant insight into the dynamics of social 
care supply and demand. The incorporation of UK demographic data shows that 
simulations can be linked closely with population data in a relatively straightforward 
way. Finally, the use of uncertainty quantification in the form of emulators allows 
us to more thoroughly explore the simulation’s parameter space, and in the process 
generate scenarios that help us examine possible futures under a wide variety of 
possible policy shifts. 

Thus we may imagine a future for model-based demography in which the 
approach becomes a trusted tool for the study of empirical questions of population 
change where social factors are of particular relevance, but also where it flourishes 
particularly when applied to systems sociological models driven by social theory, 
and policy-relevant models aimed at the generation and exploration of scenarios. 
The latter case offers another area of growth for demography, where the implemen- 
tation of models combining population data and complex agent behaviour allows 
us to create ‘policy sandboxes’ where future population trends can be studied 
under a variety of possible futures. Interacting with models of this type can help 
policy-makers to spot potential spillover effects of policy changes before real- 
world implementation, and assist them in the creation of evidence-based policy 
informed by real-world population data and a scientific approach to the modelling 
of populations. 


12.5 Model-Based Demography as an Exemplar for Social 
Science Modelling 


In Parts I and II, we examined the methodological difficulties inherent in the use of 
agent-based modelling for the social sciences. By bringing together methodological 
analyses from Alife, social simulation, population biology, and political science, 
we established the importance of a theoretical backstory for any given modelling 
enterprise. These backstories delineate the assumptions on which our models 
operate, the intended scope of the model, and the level of artificiality we ascribe 
to the model and its results. 

In practice, however, addressing these concerns in detail every time we develop 
a model seems redundant at best and a waste of time at worst. The practice of 
modelling often requires an iterative approach, in which previous simulations are 
extended in various ways, tested and at times discarded, and as a consequence each 
instance of the model could approach each of these elements slightly differently, 
even if the overall research aims stay largely identical. 


12.5.1 The Advantages of Developing a Framework 


In the case of model-based demography, we are able to alleviate this additional 
explanatory burden somewhat by developing a widely applicable methodological 
framework — a paradigm which seeks to justify the general practice of modelling 
population change in this way from the outset. Under this methodological paradigm, 
modelling is focused on a classical scientific approach, informed by data and tasked 
with studying the social factors underlying the processes generating population 
change. As model-based demographers we seek the integration of demography’s 
greatest strengths — rich population data and a centuries-long history of statistical 
expertise and innovation — with simulation’s ability to surpass some troublesome 
epistemological limits of demographic knowledge (Courgeau et al. 2017). By 
extending the concept of the statistical individual to the simulated individual, 
we establish model-based demography as a descendant of the methodological 
tradition of the discipline, and enable a generation of researchers to embrace a new 
technique without overly troubling themselves with the finer points of Artificial1 
and Artificial2. 

The advantage of this kind of approach is significant. Establishing the theoretical 
backstory in advance as a methodological addition to the field, or as a sub-field, 
allows us to approach each new model identifying as ‘model-based demography’ 
with a pre-existing knowledge of the likely scope and intent of that model. Where 
models depart from the core concepts of model-based demography, this can be 
established when documenting the model by making reference to this paradigm. 
We are able to spend more time constructing and validating models, confident that 
our intentions will be understood by the community at large without excessive 
explanation. 

Additional complexities do come into play, however, when we reach the stage 
of analysing the results of our complex demographic models. If the advantages of 
model-based demography are to be truly realised, then methods which enable us to 
understand the impact of model parameters on population-level phenomena should 
continue to be refined. Uncertainty quantification methods like Gaussian process 
emulators provide a useful starting point here, and if model-based demographers 
begin to embrace these techniques then it is likely we will see continued refinements 
in the future as we begin to adapt them to the particular case of agent-based social 
simulations. 
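
To make the idea of an emulator concrete, the following is a minimal sketch of a Gaussian process emulator fitted to a handful of runs of a stand-in simulation. It is not the implementation used for the social care model: `toy_simulation` is a hypothetical placeholder for an expensive agent-based model with a single input parameter, and the kernel length scale, variance and noise level are fixed by assumption rather than estimated from the runs.

```python
import math
import random


def rbf(a, b, length_scale=0.3, variance=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return variance * math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def solve_lower(lower, rhs):
    """Forward substitution: solve lower @ x = rhs."""
    x = []
    for i, row in enumerate(lower):
        x.append((rhs[i] - sum(row[k] * x[k] for k in range(i))) / row[i])
    return x


def solve_upper_from_lower(lower, rhs):
    """Back substitution: solve transpose(lower) @ x = rhs."""
    n = len(lower)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(lower[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (rhs[i] - s) / lower[i][i]
    return x


def toy_simulation(param, rng):
    """Hypothetical stand-in for one expensive, noisy simulation run."""
    return math.sin(3.0 * param) + rng.gauss(0.0, 0.05)


def fit_emulator(inputs, outputs, noise=0.05 ** 2):
    """Precompute the Cholesky factor and weights used for prediction."""
    n = len(inputs)
    cov = [[rbf(inputs[i], inputs[j]) + (noise if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    lower = cholesky(cov)
    alpha = solve_upper_from_lower(lower, solve_lower(lower, outputs))
    return lower, alpha


def predict(inputs, lower, alpha, x_new):
    """Posterior mean and variance of the emulator at an untried input."""
    k_star = [rbf(x_new, x) for x in inputs]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(lower, k_star)
    variance = rbf(x_new, x_new) - sum(vi * vi for vi in v)
    return mean, max(variance, 0.0)


rng = random.Random(0)
design = [i / 9 for i in range(10)]          # 10 training inputs spread over [0, 1]
runs = [toy_simulation(p, rng) for p in design]
lower, alpha = fit_emulator(design, runs)
mean_mid, var_mid = predict(design, lower, alpha, 0.5)   # inside the design region
mean_far, var_far = predict(design, lower, alpha, 3.0)   # far outside it
```

The emulator's predictive variance is small between training runs and returns to the prior variance far from them, which is what allows an analyst to see where in parameter space further simulation runs would be most informative.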

However, the development of a backstory remains important when working 
with demographic systems sociological models. Abstract models are simpler in 
their construction, generally speaking, but are not necessarily simpler in their 
implications, as we saw with Schelling’s residential segregation model (Schelling 
1971, 1978). Establishing the scope and artificiality of a model is significant, as 
simplistic models can easily be misconstrued as making overly ambitious claims 
otherwise. 


12.5.2 Model-Based Social Sciences 


The example of model-based demography illustrates the advantages of developing 
a methodological paradigm as a kind of collective theoretical backstory for an 
approach to simulation within a specific discipline. The specific case of demography 
somewhat lends itself to this way of doing things, however; demography boasts a 
lengthy history and a notable ability to absorb and refine a wide variety of statistical 
approaches. In that context, establishing another methodological framework to 
underwrite the use of simulation seems an appropriate way to situate simulation 
as a tool worthy of the same respect as statistical modelling. 

In other areas of social science, however, the range of methodologies in use 
can be much wider. We see researchers gathering data qualitatively via interviews 
or surveys, or others analysing texts or artwork, or studying the geographical 
distribution of people, artefacts and customs. Many of these disparate methods can 
provide useful knowledge that can be utilised in simulation (Gilbert and Troitzsch 
2005), but this does not mean in turn that the simulations will be considered 
trustworthy or appropriate tools by those same researchers. 

In this context we cannot simply write variations of the model-based demography 
framework as model-based social science and expect them to provide an appropriate 
theoretical backstory for such a broad range of research questions and methods. 
However, model-based demography does demonstrate a process which is more 
readily transferable. Model-based demography as a framework addresses core questions 
that are just as salient elsewhere: 


1. What are the key questions asked by our discipline? 
2. What is the main unit of analysis in our discipline? 
3. What are the main epistemological limits within our discipline? 
4. Which of these limits can be addressed in some way using simulation? 
5. How can our analyses inform a simulation process? 


Model-based demography suggests that simulation efforts in other areas of social 
science would benefit from a concerted effort to address these core questions, and 
in the process situate the approach clearly within a disciplinary context. Doing so 
not only alleviates some of the difficulties outlined above, but it provides a common 
backstory which also clarifies and communicates the aims of the work to others 
outside the discipline. This in turn allows for easier collaboration between social 
scientists and simulation practitioners, and eases the time-consuming process of 
developing a common language in simulation collaborations, which can inhibit 
progress significantly in new simulation ventures. Collaboration across disciplines 
also becomes easier, as each member of the collaboration would have a clear 
statement in hand of the methodological aims and limitations of the work to come. 

In a sense, perhaps, we would benefit from developing modelling manifestos 
of sorts. Rather than individual justifications of each model, establishing a united 
front through which we can embark on journeys into simulation allows us to get on 
with development and implementation using a common framework. Some models, 
particularly those of a more abstract, systems-sociological bent, will need to pay 
more attention to individual statements of scope and purpose, but given the more 
theory-driven and explanatory nature of such models this is naturally part of such 
an enterprise anyway. Developing such ‘manifestos’ will certainly spawn its own 
protracted arguments and divisions, naturally — we are academics, after all. That 
being said, with the splintering of so many disciplines into sub-disciplines and 
sub-sub-disciplines, we might benefit from occasional forays into self-reflection on the 
goals and limitations of our work and how it relates to our colleagues elsewhere in 
our own disciplines and neighbouring ones as well. 


12.6 A Manifesto for Social Science Modelling? 


By most measures this volume makes for a rather unwieldy manifesto for social 
modelling — the word count is excessive; it covers far too many disciplines; and 
many of the conclusions are highly malleable depending on the reader’s own 
disciplinary background and research convictions. Fortunately, this volume is not 
intended to fill that role; specialists within the varied specialisms of social science 
are far better equipped to handle the task of establishing approaches to modelling in 
their particular context (see, e.g., Conte et al. 2012). 

This volume set out to expose and discuss the challenges faced by simulation 
modellers, starting from the earliest pioneers in simulation (Schelling, Langton, 
and the rest) and moving toward the current growing interest in models of human 
sociality in many different flavours. By bringing together insights drawn from Alife, 
population biology, social simulation, and demography, we are able to develop a 
better understanding of the power and the limits of simulation when applied to 
the social sciences. The development of model-based demography shows us how 
simulation can be investigated, applied, and refined for a particular social science 
context. 

Ultimately, the further advancement of social modelling will still require significant 
work, both theoretical and practical. Conversations will need to be started 
between colleagues who hardly understand one another; conceptual chasms will 
need to be bridged; and social scientists will need to work with programmers and 
computer scientists who may have very different views of the world. Hopefully the 
discussions brought forth in this volume will make those conversations somewhat 
easier, the bridges a bit shorter and easier to construct, and the gaps in practical 
and disciplinary knowledge between social scientists and computer scientists less 
insurmountable. If it facilitates some heated debates over the writing of some 
modelling manifestos, then so much the better. 